# Automated Data Processing Pipeline {#sec:Appx-Pipeline}

This appendix describes the automated pipeline we built to ingest
raw data, process it to construct aggregate statistics, and release
those statistics publicly. This automated pipeline typically allows
us to post updated statistics within one business day of receiving
the raw data. By automating the data processing to the extent possible,
we aim to post data as close to real-time as possible, while maintaining
the quality of the data and minimizing the manual upkeep required.
The primary source of lags in the posted data is therefore
lags in the underlying data generating processes: for example,
card transactions can take up to a week to settle and employment income
is typically paid in bi-weekly or monthly payrolls. We summarize our
data engineering methods here for those who may be interested in setting
up similar infrastructure in other contexts.

*Step 1: Data Ingestion.* To flexibly accommodate diverse data
sources with varying secure file transfer methods and update frequencies,
we operate a cloud server that regularly pulls updated data from each source. 
We receive data updates from private companies on a daily, weekly, or monthly cadence. 
Many companies have unique policies and requirements for securing data transfers, so we write
scripts to intake this data using a variety of secure file transfer
services (e.g. Amazon S3 buckets and SFTP servers). We also download
or scrape a variety of publicly available statistics from the web,
such as unemployment insurance claims and COVID-19 case counts.

Three main challenges arise when handling this large volume of frequently
updated data: storing, syncing, and version controlling the data we
receive. We store all the raw data we receive as flat files in a data
lake (an Amazon S3 bucket). We use object storage rather than a database
or a more customized storage service (such as Git LFS) to minimize
storage costs while maximizing our flexibility to ingest incoming
data which arrives in numerous formats that may change over time.
We version control each snapshot of the data we download within the
same Git repository that stores our code using a tool called [DVC](https://dvc.org/)
(“Data Version Control”). DVC creates a pointer to a hash of the
raw data for each data file or folder (in other words, a shortcut
to the files in the data lake), which we version control in Git and
update every time new data is downloaded. This associates each snapshot
of data with the code that existed at the time it was processed, and
allows us to easily roll back our code and data simultaneously to
any prior state. DVC also facilitates syncing the raw data from the
data lake by efficiently downloading the data that is associated with
each pointer in the Git repository.
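
To illustrate the pointer idea, the following minimal Python sketch records a content hash for a data file and later checks the file against it. It is only an analogy for what DVC does: the `.pointer.json` file name, its fields, and the use of SHA-256 are assumptions made for this example, not DVC's actual file format or API.

```python
import hashlib
import json
from pathlib import Path


def file_digest(data_file: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(data_file.read_bytes()).hexdigest()


def write_pointer(data_file: Path) -> Path:
    """Write a small pointer file (hypothetical format) next to the data file."""
    pointer = data_file.with_name(data_file.name + ".pointer.json")
    record = {"path": data_file.name, "sha256": file_digest(data_file)}
    pointer.write_text(json.dumps(record))
    return pointer


def matches_pointer(data_file: Path, pointer: Path) -> bool:
    """Check whether the data file still matches the hash stored in its pointer."""
    return json.loads(pointer.read_text())["sha256"] == file_digest(data_file)
```

In this analogy, only the small pointer file would be committed to Git, while the data file itself stays in object storage.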

*Step 2: Data Processing.* For each dataset, we have an automated
pipeline of programs that process and transform the raw data into
the public datasets that we post online. We use an automated build
tool to organize and execute this collection of programs. We mostly
process the data using Stata and execute our automated builds within
Stata using the -project- command developed by Robert Picard.

This data processing step generates two outputs: (1) a set of CSV
files that contain all the data to be posted publicly and (2) a quality
control report. The quality control report is a document that allows
analysts to quickly assess any notable deviations in the data and
determine whether the updated data require further review before being
publicly released. Each report flags three types of changes that would
require manual review: revisions made to previously posted data, large
deviations in newly reported data, or newly missing data. The report
also contains a series of tables and figures that preview the data
and highlight any changes in the newly processed data.
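
The three flags can be sketched in Python as follows. This is an illustration rather than the production Stata code: the function name `qc_flags`, the dictionary layout (period label mapped to value), and the 25% threshold for a "large" deviation are all hypothetical choices made for the example.

```python
def qc_flags(previous, updated, max_jump=0.25):
    """Compare a previously posted series with a newly processed one.

    previous, updated: dicts mapping sortable period labels (e.g. ISO dates)
    to values; a missing or None value in `updated` counts as missing.
    max_jump: hypothetical threshold for a large period-on-period change.
    """
    flags = {"revised": [], "large_deviation": [], "newly_missing": []}

    # Revisions and newly missing values among previously posted periods.
    for period, old in previous.items():
        new = updated.get(period)
        if new is None:
            flags["newly_missing"].append(period)
        elif new != old:
            flags["revised"].append(period)

    # Large deviations among newly reported periods only.
    periods = sorted(updated)
    for prior, period in zip(periods, periods[1:]):
        if period in previous:
            continue
        before, now = updated[prior], updated[period]
        if before and now is not None and abs(now / before - 1) > max_jump:
            flags["large_deviation"].append(period)
    return flags


previous = {"2020-01-06": 100.0, "2020-01-13": 102.0, "2020-01-20": 101.0}
updated = {"2020-01-06": 100.0, "2020-01-13": 105.0, "2020-01-27": 150.0}
flags = qc_flags(previous, updated)
```

Any non-empty list in the result would indicate that the update needs manual review before release.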

Each time new data is ingested, the data processing step is run automatically.
If it runs to completion, a Git pull request is generated with DVC
pointers to the newly updated raw data alongside a link to the quality
control report. If the data processing fails (for example, because
the structure of the raw data has changed), an error report is generated.
At this point, we pause and perform a manual review before posting
the new data online. If the data processing failed or if any changes
were detected in the quality control report that require further review,
we manually investigate and write new code as needed, then re-process
the data and inspect the updated quality control report before proceeding.

After reviewing and approving the quality control report, we merge
the Git pull request containing the new data, which automatically
triggers the final Data Release step. This manual review and approval
is therefore the only manual step in the data processing pipeline.

*Step 3: Data Release.* Once the processed data is ready for
release, our scripts automatically post the updated data to two public
destinations. First, we sync the updated data into the database powering
our online data visualization website built by DarkHorse Analytics
([tracker.opportunityinsights.org](https://tracker.opportunityinsights.org)). While doing so, we also update
the “last updated” and “next expected update” dates on the
website. Second, we upload the CSV files containing all the updated
data to our [data downloads](https://github.com/OpportunityInsights/EconomicTracker) page. The updated visualizations
and data downloads are then both immediately available for public
use.

# Consumer Spending Series Construction {#sec:Appx-Affinity}

## Structure of Initial Data

We receive data from Affinity Solutions in cells corresponding to
the intersection of (i) county, (ii) income quartile, (iii) industry,
(iv) day and week. Cells where fewer than five unique cards transacted
are masked. Income quartile is assigned based on ZIP code of residence
using 2014-2018 ACS estimates of median household income. We use population
weights when defining quartile thresholds so that each income quartile
includes the same number of individuals nationally. County and ZIP
code income quartile are both determined by the cardholder's residence.

## Internal Data Processing

### *Adjusting for sharp changes in the client base* {-}

The raw Affinity data have discontinuous breaks caused by entry or
exit of card providers from the sample. We identify these sudden changes systematically 
by regressing the number of weekly county-level transacting cards on the date, then 
implementing a Supremum Wald test for a structural break at an unknown break point. 
This test is implemented separately for the pre-pandemic period before March 11, 2020 and the post-pandemic period after that date.
We apply a correction to structural breaks where the p-value of the test is less than $5\times10^{-10}$.

For counties with a break below this threshold in only one of the two periods, we correct
our estimates as follows. We first compute the state-level week-to-week
percent change, excluding all counties with a structural
break (using the national series for DC and states for which all counties
have a structural break). If we identify a structural break in week
$t$, we impute the county-level percent change with the state-level percent 
change in weeks $t-1$, $t$, and $t+1$, as we cannot ascertain the precise date
when the structural break occurred (e.g., it may have occurred on the 2nd day of week $t-1$
or the 6th day of week $t$).

For example, suppose a county has $n$ active cards up until week
$t$, when the number of cards in the county increases to $3n$. In
week $t-2$, the county would have a level of $n$ cards, its reported
value. In week $t-1$, if counties in the rest of the state had a
5% increase in the number of cards, we would impute the county with
a break to have a level of $1.05n$ cards. In week $t$, if counties
in the rest of the state had a 10% increase in the number of cards,
we would impute $t$ to have a level of $(1.10)\times(1.05n)=1.155n$.
Likewise, if counties in the rest of the state had an 8% decrease
in the number of cards in week $t+1$, we would impute $t+1$ to have
a level of $(0.92)\times(1.155n)=1.0626n$. Finally, if the county with 
a break had an increase of 3% in week $t+2$, this would be multiplied by the 
imputed level for week $t+1$ to produce a level of $(1.03)\times(1.0626n)=1.0945n$
in week $t+2$. This final step would be repeated to update levels for all 
periods after week $t+2$.
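
The worked example above can be reproduced with a short Python sketch. The function name `impute_through_break` and the list-based inputs are hypothetical, and the sketch covers only the forward case in which levels before the break are kept as reported.

```python
def impute_through_break(county_levels, state_growth, break_week):
    """Rebuild a county series around a structural break in week `break_week`.

    county_levels: reported weekly levels for the county.
    state_growth: state-level week-to-week proportional changes, aligned so
        that state_growth[i] is the change from week i-1 to week i.
    break_week: index of week t (must be at least 2 in this sketch).
    """
    # Weeks before t-1 keep their reported values.
    adjusted = list(county_levels[:break_week - 1])
    for week in range(break_week - 1, len(county_levels)):
        if week <= break_week + 1:
            # Weeks t-1, t and t+1: use the state-level change.
            growth = 1 + state_growth[week]
        else:
            # Later weeks: apply the county's own reported change.
            growth = county_levels[week] / county_levels[week - 1]
        adjusted.append(adjusted[-1] * growth)
    return adjusted


n = 100.0
reported = [n, n, 3 * n, 3 * n, 3.09 * n]   # weeks t-2, t-1, t, t+1, t+2
state = [0.0, 0.05, 0.10, -0.08, 0.0]        # state changes for the same weeks
levels = impute_through_break(reported, state, break_week=2)
```

With $n=100$, the resulting levels are approximately 100, 105, 115.5, 106.26 and 109.45, matching the multiples of $n$ in the example.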

When there is a change in coverage, we
adjust the series to be in line with the lower level of coverage.
In cases where the lower level of coverage was after the break, the above
example would be implemented in reverse to impute levels prior to the break.

We omit counties with multiple structural breaks (identified either by the 
Supremum Wald test in both periods or manually) from our series.
We also add manually identified structural breaks which are not identified
by the test, for example because the break phases in
over more than one week. We do not remove any counties where the structural
break occurred between March 10 and 31, 2020 because the
consumer spending response to COVID-19 was so strong that in many
places it could be classified as a structural break. Additionally,
since holiday spending spikes are also sometimes classified as a structural
break, we manually verify any breaks detected in the week of Thanksgiving
or the weeks immediately before, during or after Christmas.

### *Adjusting for gradual changes in the client base* {-}

In our processing, we aim to isolate and remove variation in spending
driven by the number of debit/credit cards in our sample changing
due to steady growth or shrinkage in a payment provider's customer
base. However, we also aim to incorporate extensive margin variation
in spending driven by the number of debit/credit cards in our sample
changing due to changes in the utilization of cards, which can reflect
underlying economic conditions. To account for both, we first estimate
a two-state switching model of the relationship between [Personal Consumption Expenditure](https://fred.stlouisfed.org/series/PCE)
as measured by the U.S. Bureau of Economic Analysis and the level
of credit and debit spending and number of cards in usage we received
from Affinity Solutions. Our estimates imply a state change between
February and August 2020: the number of cards in use is predictive
of Personal Consumption Expenditures during this period, but statistically
insignificant outside this period. This is consistent with a pattern
where people exhibited extensive margin changes in card spending during
the initial waves of COVID-19, but outside of that period changes
in the number of cards being used reflect changes in the number of
people in-sample as opposed to changes in the propensity to consume
of a given set of people.

We therefore construct an estimate of spending that reflects total
spending between February and August 2020, and spending per card outside this period.
For each location ($l$), industry ($i$) and time ($t$) we compute:

```{=latex}
\begin{align*}
\text{adjusted\_spending}_{l,i,t}=
\begin{cases}
\frac{spending_{l,i,t}}{cards_{l,\cdot,t}}\times\overline{cards_{l,\cdot,t'\in\text{Jan 2020}}}\text{ if } t\leq \text{January 2020}\\\\
\frac{spending_{l,i,t}}{cards_{l,\cdot,t}}\times\overline{cards_{l,\cdot,t'\in\text{Jan 2020}}}\times\frac{\overline{cards_{l,\cdot, t' \geq t \mid t'\in\text{Feb 2020}}}}{\overline{cards_{l,\cdot, t'\in\text{Feb 2020}}}}\text{ if } t\in \text{February 2020}\\\\
\frac{spending_{l,i,t}}{cards_{l,\cdot,t}}\times\overline{cards_{l,\cdot,t'\in\text{Jan 2020}}}\times\frac{cards_{l,\cdot, t}}{\overline{cards_{l,\cdot, t'\in\text{Feb 2020}}}}\text{ if } t\in \text{[March 2020, July 2020]}\\\\
\frac{spending_{l,i,t}}{cards_{l,\cdot,t}}\times\overline{cards_{l,\cdot,t'\in\text{Jan 2020}}}\times\frac{\overline{cards_{l,\cdot, t' \leq t \mid t'\in\text{Aug 2020}}}}{\overline{cards_{l,\cdot, t'\in\text{Feb 2020}}}}\text{ if } t\in \text{August 2020}\\\\
\frac{spending_{l,i,t}}{cards_{l,\cdot,t}}\times\overline{cards_{l,\cdot,t'\in\text{Jan 2020}}}\times\frac{\overline{cards_{l,\cdot,t'\in\text{Aug 2020}}}}{\overline{cards_{l,\cdot, t'\in\text{Feb 2020}}}}\text{ if } t\geq \text{September 2020}
\end{cases}
\end{align*}
```
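
As a reading aid, the piecewise definition can be transcribed into Python for a single location and industry. This is a sketch under stated assumptions, not the production code: the function name `adjusted_spending`, the list-based inputs, and the example figures are hypothetical, and periods are assigned to months by their date.

```python
from datetime import date
from statistics import mean


def adjusted_spending(dates, spending, cards):
    """Apply the piecewise adjustment for one location and industry.

    dates: list of datetime.date, one per period.
    spending: industry spending in each period.
    cards: total cards transacting in the location in each period.
    """
    def in_month(d, month):
        return (d.year, d.month) == (2020, month)

    def avg_cards(keep):
        # Average card count over the periods selected by `keep`.
        return mean(c for d, c in zip(dates, cards) if keep(d))

    jan = avg_cards(lambda d: in_month(d, 1))
    feb = avg_cards(lambda d: in_month(d, 2))
    aug = avg_cards(lambda d: in_month(d, 8))

    adjusted = []
    for t, s, c in zip(dates, spending, cards):
        ym = (t.year, t.month)
        if ym <= (2020, 1):
            factor = 1.0                      # spending per card
        elif ym == (2020, 2):
            factor = avg_cards(lambda d: in_month(d, 2) and d >= t) / feb
        elif ym <= (2020, 7):
            factor = c / feb                  # total spending, rescaled
        elif ym == (2020, 8):
            factor = avg_cards(lambda d: in_month(d, 8) and d <= t) / feb
        else:
            factor = aug / feb                # spending per card, new level
        adjusted.append(s / c * jan * factor)
    return adjusted


dates = [date(2020, 1, 15), date(2020, 2, 5), date(2020, 2, 19), date(2020, 4, 1),
         date(2020, 8, 5), date(2020, 8, 19), date(2020, 9, 2)]
cards = [100, 100, 80, 50, 60, 80, 90]
spending = [10 * c for c in cards]  # illustrative: constant spending per card
result = adjusted_spending(dates, spending, cards)
```

In this illustrative input, spending per card is constant, so all variation in `result` comes from the card-count factor: it is flat through early February, tracks the number of cards in use from March to July, and is flat again from September onward.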

We use this adjusted spending series throughout our analysis, except
in Section \ref{subsec:Stimulus-Eval}, where we always use total spending (thereby including
both intensive and extensive margin changes) in order to ensure consistency
in comparisons of our estimates across different stimulus rounds.

### *Addressing spurious changes* {-}

There is a large spike in consumer spending between January 15 and 17, 2019 that is not found in other data series.
This spike in national consumer spending is likely driven by the early
release of February 2019 SNAP benefits due to an impending government
shutdown.[^food_stamps] In order to avoid contaminating our seasonal adjustment with this
one-time shock, we replace each impacted day with the average spending
on $t-7$, $t+7$, and $t+14$, where $t$ is the impacted day.
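
A minimal Python sketch of this replacement rule follows. The function name `smooth_days`, the dictionary input, and the spending values in the example are hypothetical.

```python
from datetime import date, timedelta


def smooth_days(daily, impacted):
    """Replace each impacted day with the mean of days t-7, t+7 and t+14.

    daily: dict mapping datetime.date to spending.
    impacted: iterable of dates to replace.
    """
    smoothed = dict(daily)
    for day in impacted:
        # Neighbors are read from the original series, not the smoothed one.
        neighbors = [daily[day + timedelta(days=k)] for k in (-7, 7, 14)]
        smoothed[day] = sum(neighbors) / len(neighbors)
    return smoothed


start = date(2019, 1, 1)
daily = {start + timedelta(days=i): 100.0 for i in range(40)}
spike_days = [date(2019, 1, 15), date(2019, 1, 16), date(2019, 1, 17)]
for d in spike_days:
    daily[d] = 150.0  # illustrative spike
smoothed = smooth_days(daily, spike_days)
```

Because the comparison days fall on the same day of the week as the impacted day, the replacement preserves the within-week spending pattern.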

[^food_stamps]: Described in “Billions in food stamp payments to come early because of shutdown”, 
Politico article on January 11, 2019: 
[https://www.politico.com/story/2019/01/11/shutdown-food-stamp-scramble-benifits-1081210](https://www.politico.com/story/2019/01/11/shutdown-food-stamp-scramble-benifits-1081210) 

### *Control for seasonal fluctuations in spending* {-}

We seasonally adjust the data by calculating, for each week and day,
the year-on-year change relative to the 2019 value. We norm February
29, 2020 (a Saturday) relative to the average of February 23 and March
2, 2019 (both Saturdays). Labor Day in 2019 fell one week earlier
than in 2020, so we adjust the week of Labor Day, as well as the two
weeks before, based on the same week in 2019 relative to Labor Day
rather than the week number in the year. We then calculate 
the change relative to the January index period: 2019 data is indexed relative
to January 7 to February 3, 2019, data in 2020 onward is indexed
relative to January 6 to February 2, 2020. We then seasonally
adjust by dividing by the indexed 2019 value, which represents the
difference between the change since January 2020 compared to the change
since January 2019. 
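
The indexing and division steps can be sketched in Python as follows. The sketch assumes two weekly series that are already aligned week by week (so the leap day and Labor Day alignments described above are taken as given), and the function names and example values are hypothetical.

```python
from statistics import mean


def index_to_base(series, base_weeks):
    """Express each value relative to the mean over the index period."""
    base = mean(series[w] for w in base_weeks)
    return [v / base for v in series]


def seasonally_adjust(current, prior, base_weeks):
    """Divide the indexed current-year series by the indexed 2019 series."""
    indexed_current = index_to_base(current, base_weeks)
    indexed_prior = index_to_base(prior, base_weeks)
    return [c / p for c, p in zip(indexed_current, indexed_prior)]


base_weeks = range(4)                    # the four-week January index period
prior = [100, 100, 100, 100, 120]        # illustrative 2019 values
current = [200, 200, 200, 200, 180]      # illustrative 2020 values
adjusted = seasonally_adjust(current, prior, base_weeks)
```

In the final week of this example, the 2020 series is 10% below its January level while the 2019 series was 20% above, giving an adjusted ratio of 0.75, or a 25% decline relative to the seasonal norm.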

## Masking and Publication {#subsec:Data-Affinity-Masking-and-Publication}

Cells with fewer than 5 card transactions are masked. Additionally,
because the Supremum Wald test cannot identify breaks in the most
recent few weeks of data, we mask particularly large changes. If a
week-on-week change is $>4$x the median change of that
series, and $<1/3$ of all series have a large change defined
in this way, the value of the change is replaced with the preceding
week multiplied by the national average change. We place this 1/3
restriction to avoid overmasking holiday spending, when almost all
series experience large changes.

### *Definition of Categories of Goods* {-}

In parts of our analysis, we distinguish between four categories of
goods and services: durable goods; non-durable goods; remote services;
and in-person services.
```{=latex}
\begin{itemize}
\item We define durable goods as the following MCC groups: Building materials,
garden equipment, and supplies; electronics and appliances; furniture
and home furnishings; motor vehicles and parts; sporting goods, hobbies,
musical instruments, and book stores; and telecommunications.
\item We define nondurable goods as the following MCC groups: General merchandise;
wholesale trade; clothing and clothing accessories; health and personal
care stores; food and beverage stores; misc store retailers; and gas
stations.
\item We define remote services as the following MCC groups: Utilities,
construction, and manufacturing; professional, scientific, and technical
services; public administration; administrative and support and waste
management and remediation services; information; education; finance;
and nonstore retailers.
\item We define in-person services as the following MCC groups: Rental and
leasing; repair and maintenance; and personal and laundry services.
\end{itemize}
```

### *Substitution from Card to Cash Spending* {-}

A potential concern with our card-based estimates of spending changes
is bias from substitution out of cash purchases, which account for
6.3% of consumer spending in the United States [@diaryconsumerpayments].
For instance, if individuals sought to use more contactless methods
to pay or began placing more orders online, trends in card spending
might exhibit excess volatility relative to overall spending. To assess
the importance of such substitution, we examine cash purchases using
receipts data from CoinOut, a company that allows individuals to receive
rewards by uploading photos of their receipts to a mobile app. We
focus on grocery spending in the card data because cash spending in
CoinOut is concentrated in certain sectors such as groceries; unfortunately,
we are unable to disaggregate the CoinOut data by sector or align
sectoral definitions more precisely across the datasets.

Appendix Figure \ref{fig:affinity_coinout} plots week-on-week changes
in aggregate cash purchases in the CoinOut data vs. aggregate card
spending at grocery stores over time. The time trends are very similar
between the two series (with a correlation of {{aff_coinout_corr}}
at the weekly level), showing a sharp spike in spending in late March 2020
(as households stocked up on groceries), followed by a more sustained
increase in spending from the latter half of April 2020. These results
suggest that households shifted spending similarly across both modes
of payment.

## Benchmarking

*Comparison to QSS and MARTS.* Total debit and credit card
spending in the U.S. was $7.08 trillion in 2018 [@BoGFED2019],
approximately 50% of total personal consumption expenditures recorded
in national accounts. Appendix Figure \ref{fig:ind_shares} compares
the spending distributions across sectors in the Affinity data to
spending captured in the nationally representative Quarterly Services
Survey (QSS) and Advance Monthly Retail Trade Survey (MARTS), which
together cover 92% of the expenditure-weighted categories in the
Affinity data. The Affinity series has broad coverage across industries,
but over-represents categories in which credit and debit cards are
used for purchases. In particular, accommodation and food services
and clothing constitute a greater share of the card spending data
than financial services and motor vehicles. We therefore view the
Affinity series as providing statistics that are representative of
total card spending, but not total consumer spending. We assess whether
the Affinity series accurately captures changes in total card spending
in the COVID recession in Section \ref{subsec:Impacts-Spending}.

# Small Business Revenue and Openings Series Construction {#sec:Data-Womply}

## Structure of Initial Data {#subsec:Womply-Data-Structure}

We receive total small business debit and credit revenue data from
Womply, where the total revenue data is aggregated from settled credit
card transactions that Womply receives from payment processing partners.
Additionally, we receive data from Womply on the number of businesses
open, defined as the number of businesses
making at least one transaction within a three day window. The primary
raw data we receive and process is a 52-week “no-entry” panel
of firms at the county by sector by week level, which is a repeated
panel following the set of firms operating in each week $t$ over
the subsequent 52 weeks (with attrition from the sample but no entry during the panel).
To construct ZIP-level estimates, we additionally use a cross-sectional
sample of firms at the ZIP code by ZIP income quartile by sector by
day level. We measure small business revenue as the sum of all credits
(generally purchases) minus debits (generally returns). All transactions
and derived data we receive are tied to the ZIP code or county containing
the business. The sample is limited to small businesses as 
[defined by the Small Business Administration](https://www.sba.gov/sites/default/files/2022-07/Table\%20of\%20Size\%20Standards_Effective\%20July\%2014\%202022_Final-508.pdf).

## Internal Data Processing

### *Adjusting for the evolving client base* {-}

In each calendar year, we follow the sample of businesses operating
during the first week of the year (i.e. we start following a new panel
each calendar year). No new businesses enter our panel during the
calendar year. Businesses may exit because they stop operating or
because the underlying payment processors ceased providing data.

We detect cases where a payment processor disappears by detecting
sharp drops in businesses operating, at the national and the state
level. We then adjust the series to identify and correct breaks in
the data introduced by these merchant exits from the sample. To do so,
we identify breaks in series from July 2020 to present
as downward discontinuities of at least 2.5 percentage points from week $t-1$
that persist for at least two weeks—i.e. remain at least 2.5 percentage points
below week $t-1$ in weeks $t$, $t+1$ and $t+2$. For weeks prior to July 2020
we manually identify breaks in weeks 21, 22, 27, 31 nationally and
at week 25 for a subset of states (where week numbers refer to weeks
of 2020).
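
The automated rule for July 2020 onward can be sketched in Python as follows. The function name `find_breaks` is hypothetical, and the sketch assumes the series is already expressed in percentage points (for example, percent change relative to January 2020).

```python
def find_breaks(series, drop=2.5):
    """Return indices t where the series falls by at least `drop` points
    relative to week t-1 and stays that far below it in weeks t, t+1, t+2."""
    breaks = []
    for t in range(1, len(series) - 2):
        baseline = series[t - 1]
        window = series[t:t + 3]  # weeks t, t+1 and t+2
        if all(baseline - value >= drop for value in window):
            breaks.append(t)
    return breaks


persistent = [0.0, -1.0, -5.0, -5.5, -4.0, -4.5]
transient = [0.0, -5.0, 0.0, 0.0, 0.0]
```

Here `find_breaks(persistent)` flags the drop at index 2, while `find_breaks(transient)` flags nothing because the series recovers in the following week.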

For breaks identified prior to July 2020, business revenue was in
a period of strong recovery, so we assume “momentum” during the
weeks with missing data. We calculate the average rate of change over
the 4 weeks preceding the discontinuity and use that value to impute
the rate of change for the identified week of the discontinuity. For
breaks identified from July 2020 to present, we adjust a given merchant
or sales series as follows. For a merchant series, in the week of
a discontinuity we impute the number of merchants as
the value from week $t-1$ and adjust the following weeks accordingly,
resetting the number of merchants to the level prior to the identified
exit from the sample. For a sales series, if week $t-1$ has a discontinuity,
we set the change in sales between week $t-2$ and week $t-1$
to zero and adjust the following weeks accordingly. If week $t$
additionally has a discontinuity, we correct week $t$ following the same
procedure and adjust the following weeks accordingly.

### *Produce the ZIP code series* {-}

As described in Appendix \ref{subsec:Womply-Data-Structure}, we do
not receive any panel data disaggregated to the ZIP code level: we
only observe cross-sectional ZIP level data. We therefore perform
an additive adjustment on the cross-sectional ZIP level series so
that the weighted sum of the processed ZIP series aligns with the
county-level “same store” panel data.
The resulting data obtains the levels from the county “no-entry” panel,
and the within-county across-ZIP variation from the ZIP level cross-section.

### *Reducing the influence of outliers* {-}

To reduce the influence of outliers, firms outside twice the interquartile
range of firm annual revenue within this sample are excluded and the
sample is further limited to firms with 30 or more transactions in
a quarter and more than one transaction in 2 out of the 3 months.

We also manually exclude some state x industry breakdowns that present
extreme variation from our state and national level calculations,
as well as a small number of counties that demonstrate extreme variation.

### *Controlling for seasonality* {-}

We seasonally adjust reported revenue by calculating the year-on-year
change relative to the corresponding value in 2019. We calculate the
change relative to the January index period: 2019 data is indexed
relative to January 2019, data in 2020 onward is indexed relative
to January 2020. We then seasonally adjust by dividing by the indexed
2019 value.

## Masking and Publication

To preserve the privacy of firms in the data and to avoid displaying
noisy estimates for small cells, we mask cells with less than $250,000 in 
total revenue during the base period of January 4 to 31, 2020. 
The ZIP-level data we receive from Womply adds merchants and
an imputed revenue quantity such that every cell with 1 or 2 merchants
has no fewer than 3 merchants. This imputation has the result of dampening
the effect of any declines that would otherwise place the number of
merchants in a cell at 1 or 2, lowering the effect of any increase
from 1 or 2 merchants to 3 merchants, and enhancing the effect of
any increase from 0 merchants to 1 or 2 merchants. We address the
effects of this imputation by dropping any ZIP code that has imputed
values for more than 25% of the weeks in our period of study.

## Benchmarking

*Comparison to QSS and MARTS.* Appendix Figure \ref{fig:ind_shares}
shows the distribution of revenues observed in Womply across industries
in comparison to national benchmarks. Womply revenues are again broadly
distributed across sectors, particularly those where card use is common.
A larger share of the Womply revenue data come from industries that
have a larger share of small businesses, such as food services, professional
services, and other services, as one would expect given that the Womply
data only cover small businesses.

# Job Postings Series Construction {#sec:Data-Burning-Glass}

## Structure of Initial Data

We receive a dataset containing the weekly number of unique new job postings,
defined as those that have not already been posted within a 60 day
window. This data is disaggregated by geography (county, state, national),
by 2-digit NAICS code, and by ONET code. Job postings are sourced
by Lightcast (formerly known as Burning Glass Technologies) from over
40,000 job boards worldwide.

## Internal Data Processing

### *Reducing the influence of outliers* {-}

In order to avoid extreme outliers, we calculate a cutoff of one standard
deviation above the 97th percentile of the state-level data for each
variable and mask values that exceed this threshold.
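
A minimal Python sketch of this cutoff follows. The function name `mask_outliers` is hypothetical, and the choice of the sample standard deviation and of the default interpolation used by `statistics.quantiles` for the 97th percentile are assumptions; the text does not specify either.

```python
from statistics import quantiles, stdev


def mask_outliers(values):
    """Mask (set to None) values above the 97th percentile plus one
    standard deviation of the supplied state-level values."""
    p97 = quantiles(values, n=100)[96]  # 97th of the 99 percentile cut points
    cutoff = p97 + stdev(values)
    return [v if v <= cutoff else None for v in values]


values = list(range(1, 101)) + [1000]  # illustrative data with one extreme value
masked = mask_outliers(values)
```

In this example only the extreme value of 1000 is masked; the remaining values are returned unchanged.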

## Masking and Publication

We perform some imputations of the county by subgroup-level data to
ensure the privacy of the underlying job postings dataset, as required
by our data use agreement. In the 200 most-populated counties, total
job posts by subgroup are reported directly without imputation. For
total job posts by subgroup in smaller counties, we impute the number
of job postings: we report the total county level postings by subgroup
multiplied by the share of total state level postings of the corresponding
subgroup. All state-level data and national data are reported without imputations.
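
The imputation for smaller counties can be sketched in Python as follows. The sketch reflects one reading of the rule, namely that a county's total postings are allocated across subgroups in proportion to each subgroup's share of state-level postings; the function name and the example figures are hypothetical.

```python
def impute_county_subgroups(county_total, state_by_subgroup):
    """Allocate a county's total postings across subgroups using state shares.

    county_total: total job postings in the county.
    state_by_subgroup: dict mapping subgroup to state-level postings.
    """
    state_total = sum(state_by_subgroup.values())
    return {
        subgroup: county_total * postings / state_total
        for subgroup, postings in state_by_subgroup.items()
    }


imputed = impute_county_subgroups(200, {"A": 300, "B": 100})
```

By construction, the imputed subgroup values sum to the county total.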

## Benchmarking

*Comparison to JOLTS.* Lightcast data have been used extensively
in prior research in economics; for instance, see @HershbeinKahn_RecessionsTechChange
and @BurningGlassDemingKahn. @carnevale2014understanding
show that the Lightcast data are reasonably well-aligned with government
survey-based statistics on job openings and characterize the sample
in detail. In Appendix Figure \ref{fig:bg_jolts}, we compare the
distribution of industries in the Lightcast data to nationally representative
statistics from the Bureau of Labor Statistics’ Job Openings and Labor
Turnover Survey ([JOLTS](https://www.bls.gov/jlt/)) in
January 2020. In general, Lightcast is well aligned across industries
with JOLTS, with the one exception that it under-covers government
jobs. We therefore view Lightcast as a sample representative of private
sector jobs in the U.S.

# Employment Series Construction {#sec:Data-Employment}

## Structure of Initial Data

The employment series is constructed with data from three data providers:
Paychex, Intuit, and Earnin.

### *Paychex* {-}

We obtain aggregated weekly data on total employment for each county
by industry (two-digit NAICS), hourly wage quartile (defined
below), firm size bin and pay frequency. Salaried employees' wages
are translated to hourly wages by dividing weekly pay by 40 hours.
To measure private sector employment, we exclude workers employed
in public administration and those with an unclassified industry (which
each represent 0.8% of workers as of January 2020). We restrict the
sample to workers with weekly, bi-weekly, semi-monthly or monthly
pay frequencies; these workers represent over 99.8% of employees
in the Paychex data.

To classify workers by wage quartile while adjusting for wage growth,
we construct moving wage quartile thresholds based on 100%, 150%
and 250% of the federal poverty line (FPL). The FPL is defined as
an annual income, which we convert into a full-time-equivalent hourly
wage by dividing by 2000 hours (50 weeks of work at 40 hours per week).
In our benchmark period of January 2020, the thresholds are ${{wage_threshold_q1q2}},
${{wage_threshold_q2q3}} and ${{wage_threshold_q3q4}}.
These thresholds group workers approximately into quartiles: in January
2020, the four bins in ascending order by wage contain {{pct_emp_cps_q1}}%,
{{pct_emp_cps_q2}}%, {{pct_emp_cps_q3}}%, and {{pct_emp_cps_q4}}%
of CPS respondents. Since the FPL is set annually at the beginning
of each year, we estimate monthly thresholds within each year by multiplying
the FPL by the growth in the monthly Consumer Price Index (CPI) since
January. When the official FPL estimate is published at the beginning
of the subsequent year, its growth does not exactly match the growth
in CPI. To maintain consistency with the levels of the FPL, we revise
the thresholds from the prior year when the FPL is released. We compute
the difference between our CPI-based thresholds and the official FPL
thresholds during the relevant year, divide this residual by 12, and
add the rescaled residual to the projected FPL in each month.
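
The conversion from an annual FPL to hourly thresholds, and the within-year CPI projection, can be sketched in Python as follows. The function names are hypothetical, and the FPL and CPI figures are round illustrative numbers rather than official values. The sketch omits the end-of-year revision step.

```python
def hourly_thresholds(fpl_annual, multiples=(1.0, 1.5, 2.5), hours=2000):
    """Convert an annual FPL into hourly thresholds at 100%, 150%, 250%."""
    return [fpl_annual * m / hours for m in multiples]


def projected_fpl(fpl_january, cpi_by_month):
    """Project the FPL within a year using CPI growth since January.

    cpi_by_month: CPI values starting with January of the same year.
    """
    base = cpi_by_month[0]
    return [fpl_january * cpi / base for cpi in cpi_by_month]


thresholds = hourly_thresholds(26_000)              # illustrative FPL
projection = projected_fpl(26_000, [100.0, 101.0])  # illustrative CPI values
```

With these illustrative inputs, the January thresholds are 13.0, 19.5 and 32.5 dollars per hour, and 1% CPI growth raises the projected FPL from 26,000 to 26,260.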

We provide these moving wage quartile thresholds to Paychex,
which uses them internally to assign workers to an
hourly wage quartile in the aggregated weekly dataset we receive.
In January of each year, we provide revised quartile thresholds to
Paychex and revise our estimates for the prior year---the revised
estimates hold aggregate employment constant but reallocate some workers
across quartiles.

Using moving thresholds to assign workers to wage quartiles generates
occasional discontinuities in each quartile due to bunching in the
wage distribution at integers, as discussed in Section \ref{subsec:Data-Employment}
and detailed in Appendix \ref{subsec:Internal-Data-Processing}. To
correct these discontinuities, we receive weekly counts of the number
of employees earning an integer hourly wage for each integer value
between $13 and $25. These counts are received at the same level
of disaggregation as the weekly data on employment: county by industry
(two-digit NAICS), firm size bin and pay frequency.

### *Intuit* {-}

We obtain anonymized, aggregated data on month-on-month and year-on-year
changes in total employment (the number of workers paid in the prior
month) and average earnings at the state and county level by month,
based on repeated cross-sections. To develop a national series, we
take population-weighted averages of state changes in each month.
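
The national aggregation is a weighted mean, sketched in Python below. The function name and the state labels, changes and populations are hypothetical.

```python
def national_change(state_changes, state_population):
    """Population-weighted average of state-level changes for one month.

    state_changes: dict mapping state to its proportional change.
    state_population: dict mapping state to its population weight.
    """
    total_population = sum(state_population[s] for s in state_changes)
    weighted_sum = sum(state_changes[s] * state_population[s] for s in state_changes)
    return weighted_sum / total_population


change = national_change(
    {"A": 0.02, "B": -0.04},
    {"A": 3_000_000, "B": 1_000_000},
)
```

In this example the larger state's 2% increase outweighs the smaller state's 4% decrease, giving a national change of 0.5%.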

### *Earnin* {-}

We obtain anonymized data from Earnin at the paycheck level with information
on industry (2-digit NAICS), firm size, ZIP code, unemployment status,
wages, and earnings. The median worker in the Earnin sample in January
2020 has an hourly wage rate of ${{earnin_wage_p50}}, which falls at the {{cps_wage_p50}}th percentile
of the national distribution of private sector non-farm workers in
January 2020 CPS data. The interquartile range of wages in the Earnin
sample is \${{earnin_wage_p25}}-\${{earnin_wage_p75}} 
(corresponding to the {{cps_wage_p25}}th and {{cps_wage_p75}}th percentiles of the national distribution).

Industry and firm size are not directly measured in Earnin's administrative
databases, so these variables were attached to the anonymized paycheck-level
data by linking a list of employers with more than ten Earnin users
to external databases. To obtain information on industry, we use a
custom-built crosswalk created by Digital Divide Data which contains
NAICS codes for each employer in the Earnin data with more than ten
Earnin users. To obtain information on firm size, we crosswalk Earnin
employers to ReferenceUSA data at the firm location level by spatially
matching Earnin employers to ReferenceUSA firms. We begin by geocoding
Earnin addresses to obtain latitudes and longitudes for each Earnin
employer. We then remove common prefixes and suffixes of firm names,
such as “inc” and “associated”. Next, we compute the trigram
similarities between firm names for all Earnin and ReferenceUSA firms
within twenty-five miles of one another. We then select one “match”
for each Earnin firm within the ReferenceUSA data, among the subset
of firms within one mile. We first match Earnin employers to ReferenceUSA
firms if the firms are within one mile of one another, and share the
same firm name. Second, where no such match is available, we choose
the geographically closest firm (up to a distance of one mile) among
all firms with string similarities of over 0.6. Third, where no such
match is available, we match an Earnin employer to the ReferenceUSA
employer within twenty-five miles with the highest trigram string
similarity, provided that the match has a trigram string similarity
of at least 0.9. We then compute the modal parent-firm match in the ReferenceUSA
data for each parent-firm grouping in Earnin. Where at least 80%
of locations within a parent-firm grouping in Earnin are matched to
a single parent-firm grouping in the ReferenceUSA data, we impute
that parent-firm to every Earnin location. In total, we match around
70% of Earnin employers to ReferenceUSA firms.
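
The three-tier location match can be sketched as follows. This is an illustrative sketch under stated assumptions, not the production implementation: trigram similarity is taken to be the Jaccard overlap of padded character trigrams (loosely in the style of PostgreSQL's `pg_trgm`), distances are great-circle distances, the third-tier cutoff is read as “at least 0.9”, firm names are assumed to be already cleaned, and all function and field names are hypothetical.

```python
import math


def trigrams(name):
    """Set of character trigrams of a lower-cased, padded firm name."""
    padded = "  " + name.lower().strip() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a, b):
    """Assumed definition: shared trigrams divided by all distinct trigrams."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def miles_between(p, q):
    """Great-circle (haversine) distance in miles between (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 3958.8 * math.asin(math.sqrt(h))


def match_employer(employer, candidates):
    """Return the matched candidate dict, or None if no tier yields a match.

    employer and each candidate are dicts with 'name' and 'latlon' keys.
    """
    scored = [(c,
               miles_between(employer["latlon"], c["latlon"]),
               trigram_similarity(employer["name"], c["name"]))
              for c in candidates]
    # Tier 1: identical name within one mile (closest one if several).
    tier1 = [s for s in scored
             if s[1] <= 1 and s[0]["name"].lower() == employer["name"].lower()]
    if tier1:
        return min(tier1, key=lambda s: s[1])[0]
    # Tier 2: geographically closest firm within one mile, similarity over 0.6.
    tier2 = [s for s in scored if s[1] <= 1 and s[2] > 0.6]
    if tier2:
        return min(tier2, key=lambda s: s[1])[0]
    # Tier 3: most similar name within 25 miles (assumed cutoff: at least 0.9).
    tier3 = [s for s in scored if s[1] <= 25 and s[2] >= 0.9]
    if tier3:
        return max(tier3, key=lambda s: s[2])[0]
    return None
```

The parent-firm imputation described above would then operate on the output of `match_employer` across all locations of a parent-firm grouping.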

## Internal Data Processing {#subsec:Internal-Data-Processing}

We process the data received from the three data providers (Paychex,
Intuit, and Earnin) as follows.

### *Paychex* {-}

*Addressing anomalous spikes.* We manually identify large anomalous
spikes in the data and smooth these by interpolating values from adjacent
weeks at the county x 2-digit NAICS code x hourly wage quartile x
2019 firm size bin x pay frequency level.

*Adjusting for discontinuities when quartile thresholds cross integer wages.* 
We adjust for occasional discontinuities in employment
in each wage quartile due to bunching in the wage distribution. Our
adjustment is mathematically equivalent to adding a uniform random
variable distributed between $[-0.5,0.5]$ to whole number wages,
transforming the point mass of employees at the integer *wage*
into a uniform distribution between $[wage - 0.5, wage + 0.5]$. 
Rather than adding the random variable *ex ante* using employee microdata before aggregating, which
we did not access, we applied an equivalent procedure that is feasible
*ex post* after receiving the aggregated data described above
containing employee counts. As an example, suppose the threshold separating
the first wage quartile (Q1) and the second wage quartile (Q2) is
$13.75 in a given month. Then adding the uniform random variable
to the wages of employees earning $14 will move 25% of the point
mass below the threshold. Since we observe the counts on each side
of the $13.75 threshold and the count at the $14 point mass, we
add 0.25 $\times$ (number of employees at $14) to the Q1 count
and subtract the same number from the Q2 count. We perform the analogous
process for each threshold in each month, calculating the share of
the point mass at the integer value within $0.50 of the threshold
that should be moved across the threshold, and adding and subtracting
that value from the quartiles below and above the threshold accordingly.
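
The worked example above can be reproduced with a short sketch. The function name and arguments are hypothetical, and the sketch assumes that workers earning exactly the threshold are counted in the quartile above it.

```python
def adjust_for_integer_bunching(count_below, count_above, threshold,
                                count_at_integer, integer_wage):
    """Reallocate part of an integer-wage point mass across a threshold.

    Mirrors spreading the point mass at integer_wage uniformly over
    [integer_wage - 0.5, integer_wage + 0.5].
    """
    lower = integer_wage - 0.5
    # Share of the smoothed mass that lies below the threshold.
    share_below_after = min(1.0, max(0.0, threshold - lower))
    # Share of the raw point mass recorded below the threshold
    # (assumption: a wage equal to the threshold counts as above it).
    share_below_before = 1.0 if integer_wage < threshold else 0.0
    shift = (share_below_after - share_below_before) * count_at_integer
    return count_below + shift, count_above - shift
```

With a threshold of 13.75 and 200 employees earning exactly 14, the call `adjust_for_integer_bunching(1000, 1200, 13.75, 200, 14)` moves 50 employees from the upper to the lower quartile. Integer wages more than 0.50 away from the threshold produce no shift.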

*Mapping paychecks to employment periods.* To construct a series
of employment as of each date, we construct a series of pay periods
ending as of each date. We take a separate approach for paychecks
following regular weekly cycles (i.e. weekly and bi-weekly paychecks)
and for paychecks following a cycle based on fixed calendar dates
(i.e. semi-monthly and monthly paychecks). For weekly and bi-weekly
pay frequencies, we use data provided by Paychex on the distribution
of the number of days between a worker's pay date and the last date
in the worker's pay period (i.e., the date at which payroll is processed
minus the last date in the pay period) to distribute paychecks to the
last date of the corresponding pay period. We treat this distribution
as constant across geographies and NAICS
codes. For monthly and semi-monthly pay frequencies, where cycles
regularly occur on fixed calendar dates (e.g. the 15th and 30th of
each month for semi-monthly paycycles), we assume that the last date
within each pay period is the closest preceding calendar date that
is the 15th or the 30th day of the month (semi-monthly paycycles)
or the 30th day of the month (monthly paycycles). We then record a
worker as being employed for the full duration of the pay cycle up
until the last date in their pay period, under the assumption that
workers are employed for each day during their pay period.
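
For the fixed-calendar-date case, the assignment of a pay period end date can be sketched as below. Two details are assumptions made for illustration and are not stated in the text: “closest preceding” is read as on or before the pay date, and in months with fewer than 30 days the last day of the month stands in for the 30th. The function name is hypothetical.

```python
import calendar
import datetime as dt


def period_end(pay_date, frequency):
    """Closest anchor date on or before pay_date.

    frequency: "semi-monthly" (anchors on the 15th and 30th)
               or "monthly" (anchor on the 30th).
    """
    anchor_days = (15, 30) if frequency == "semi-monthly" else (30,)
    day = pay_date
    while True:
        days_in_month = calendar.monthrange(day.year, day.month)[1]
        # Assumption: short months use their last day in place of the 30th.
        anchors = {min(d, days_in_month) for d in anchor_days}
        if day.day in anchors:
            return day
        day -= dt.timedelta(days=1)
```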

*Adjusting for an evolving client base.* We take steps to adjust
for the entry and exit of Paychex clients from the sample. In each
county x industry (two-digit NAICS code) x firm size x wage quartile
cell, we compute the change in employment relative to January 4 to 31,
2020, and the change in employment relative to July 2020. Let
$\text{Min Normed July}_{c,i,s,q}$ be the smallest value of indexed
employment we observe at each date relative to its mean level over
July 2020 and $\text{Max Normed January}_{c,i,s,q}$
be the largest value of indexed employment we observe at each date
relative to its mean level over January 4 to 31, 2020.

For county x industry x firm size x wage quartile cells with at most
50 employees at all points between January 2020 and the end of the
series, we reduce the weight we place on the series if we observe
large changes in employment that indicate firm entry or exit. In particular,
we compute the weight on the cell for county $c$, industry $i$,
firm size $s$, and wage quartile $q$ as:
```{=latex}
\begin{align*}
\text{Weight}_{c,i,s,q}= & \text{max}\Big\{1-\mathbf{1}\{\text{Min Normed July}_{c,i,s,q}\leq50\}\times(50-\text{Min Normed July}_{c,i,s,q})\times0.02\\
 & -\mathbf{1}\{\text{Max Normed January}_{c,i,s,q}\geq4000\}\times(\text{Max Normed January}_{c,i,s,q}-4000)\times0.001,\\
 & 0\Big\}
\end{align*}
```

That is, we reduce the weight we place on the cell by two percentage
points for each percentage point of decline we observe below 50 percentage
points relative to July 2020. We further reduce the weight by 0.1
percentage points for each percentage point of growth we observe above
4000 percentage points relative to January 2020.

For county x industry x firm size x wage quartile cells with over
50 employees at any point between January 2020 and the end of the
series, we reduce the weight we place on the series using more stringent
restrictions, reflecting the fact that extreme growth rates are more
anomalous in larger cells. In particular, we compute the weight on
the cell for county $c$, industry $i$, firm size $s$, and wage
quartile $q$ as:
```{=latex}
\begin{align*}
\text{Weight}_{c,i,s,q}= & \text{max}\Big\{1-\mathbf{1}\{\text{Min Normed July}_{c,i,s,q}\leq50\}\times(50-\text{Min Normed July}_{c,i,s,q})\times0.02\\
 & -\mathbf{1}\{\text{Max Normed January}_{c,i,s,q}\geq600\}\times(\text{Max Normed January}_{c,i,s,q}-600)\times0.005,\\
 & 0\Big\}
\end{align*}
```

That is, we reduce the weight we place on the cell by two percentage
points for each percentage point of decline we observe below 50 percentage
points relative to July 2020. We further reduce the weight by 0.5
percentage points for each percentage point of growth we observe above
600 percentage points relative to January 2020.
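
Both weighting rules can be expressed in a single function, sketched below. The function and argument names are hypothetical; the sketch assumes both indexed series are expressed in percent of their baseline mean (so that 100 means no change) and that cell size is summarized by the maximum employee count observed between January 2020 and the end of the series.

```python
def cell_weight(min_normed_july, max_normed_january, max_employees):
    """Weight on a county x industry x firm size x wage quartile cell."""
    if max_employees <= 50:
        # Small cells: penalize growth only above 4000, at 0.1pp per point.
        growth_cutoff, growth_penalty = 4000.0, 0.001
    else:
        # Larger cells: stricter cutoff of 600, at 0.5pp per point.
        growth_cutoff, growth_penalty = 600.0, 0.005
    weight = 1.0
    if min_normed_july <= 50:
        # Two percentage points of weight per point of decline below 50.
        weight -= (50 - min_normed_july) * 0.02
    if max_normed_january >= growth_cutoff:
        weight -= (max_normed_january - growth_cutoff) * growth_penalty
    return max(weight, 0.0)
```

For example, a small cell whose employment bottoms out at 40 percent of its July 2020 mean receives a weight of 0.8, while a large cell that peaks at 700 percent of its January 2020 mean receives a weight of 0.5.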

In addition, we address the especially concentrated entry and exit
of firms from the sample at the end of each calendar year, where there
is significant churn as firms renew their payroll processing contracts.
Due to this seasonal pattern, the raw Paychex data display a downwards
trend in employment at the end of each calendar year as some clients
leave Paychex, followed by an upward trend in employment at the very
beginning of each calendar year as new clients join. To avoid this
source of error, we adjust for the end-of-year pattern in the Paychex
data using data from the end of 2019. For each date between November
15, 2020 and January 10, 2021, using Paychex data on employment at the
national level, we compute the change in employment relative to November
15, 2020 at the two-digit NAICS code x wage quartile level. We also
compute the change in employment between the corresponding day in
the previous year and November 15, 2019. We divide the change in employment
relative to November 15, 2020 by the corresponding change in employment
the previous year relative to November 15, 2019. To avoid a break in
the series at January 10, 2021, we adjust the series from January 10 onwards
using the adjusted level on January 10 and the unadjusted trend from
January 10. We repeat this end-of-year adjustment during each subsequent
year. We then apply the same adjustment to each two-digit NAICS code
x wage quartile cell at the state, county and city levels.

*Adjusting for minimum wage changes.* Finally, we address mismeasurement
of the change in employment in the first and second wage quartiles
due to minimum wage increases in California, Massachusetts, and New
York during our period of study. By increasing the minimum wage above
the threshold between the first and second wage quartiles at the end
of 2020 or start of 2021, these states mechanically shift workers
out of the first wage quartile into the second wage quartile, leading
to spurious changes in the quartile-specific series.[^min_wage_arizona] 
To address this issue, we make the following adjustments. We begin
by calculating the percentage change in the number of employees in
each state x industry (2-digit NAICS) cell from December 4, 2020 onwards
for the first two wage quartiles combined. From December 4, 2020 onwards,
for the first two wage quartiles in minimum wage change states, we
impute the trend in each county x industry x wage quartile cell using
the below-median income employment trend in their own state x industry
cell. Since employment in the first wage quartile recovered less than
employment in the second wage quartile in 2021, an imputation that
aggregates the trends in the first and second wage quartiles tends
to overstate the recovery in the first quartile and understate the
recovery in the second quartile. As such, we further rescale the state
x industry below-median income trend separately for each wage quartile,
using the coefficient from a regression (estimated without a constant)
of that quartile's employment change on the below-median income employment
change in states without minimum wage changes between December 4, 2020 and
December 3, 2021. We then aggregate the adjusted employment counts
to the relevant geographies (e.g. state; national) before calculating
the change in employment since January 2020.
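
The quartile-specific rescaling relies on a regression without a constant, whose slope has the closed form $\sum_t x_t y_t / \sum_t x_t^2$. A minimal sketch, with hypothetical function names and with each series represented as a plain list of employment changes, is:

```python
def slope_without_constant(x, y):
    """OLS slope from regressing y on x with no intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def rescaled_trend(below_median_trend, beta):
    """Apply a quartile-specific coefficient to the below-median trend."""
    return [beta * value for value in below_median_trend]
```

Here `x` would hold the below-median employment changes and `y` the changes for one wage quartile, both taken from states without minimum wage changes.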

[^min_wage_arizona]: Arizona also increased its minimum wage during this period, but its
minimum wage ($12.15 in 2021; $12.80 in 2022) remained well below
the threshold between the first and second wage quartiles ($13.25
in January 2021; $13.88 in January 2022), so we do not adjust the
employment series for this state.

### *Paychex and Intuit: Combined Employment Series* {-}

We combine Paychex and Intuit data to construct our primary employment
series. Our data sharing agreements require us to produce combined
estimates in our public releases of employment data.

*Imputing local composition of Intuit data.* Because the Paychex data
are disaggregated by sector and cover all sectors and wage levels
fairly comprehensively, we use them as the base for the combined employment
series. We then use the Intuit data to refine the series in cells that
Intuit covers. Intuit provides us with overall national industry
shares as of 2019, but does not disaggregate the monthly employment
data we receive by wage level or industry. We therefore impute the
Intuit data to wage-industry cells before combining it with the Paychex
data. To do so, we assume that any differences in employment between
Intuit and Paychex are constant (in percentage terms, relative to
the January baseline) by industry and wage quartiles within a given
geography and month. For each date, we impute the number of Intuit
employees in a given geography x industry x wage quartile cell as
the number of Paychex employees in that cell reweighted by the national
Intuit industry distribution. We then weight the geography-level changes
in employment in the Intuit data by the imputed number of employees
in its geography x industry x wage quartile cell.

*Combining Paychex and Intuit data.* We take a weighted average
of the Paychex data and the imputed Intuit data to compute the final
combined series. We place the majority of the weight on Paychex, with
greater weight on Intuit in sectors where it has greater coverage;
the exact weights are undisclosed to protect privacy. We report seven-day
moving averages of these series, expressed as a percentage change
relative to January 4 to 31, 2020.

### *Earnin* {-}

*Sample Restrictions.* We restrict the sample to workers who
are active Earnin users, with non-missing earnings and hours worked
over the last 28 days. Next, we exclude workers whose reported income
over the prior 28 days is greater than $50,000/13 (corresponding
to an income of greater than $50,000 annually). We then restrict
the sample to workers who are in paid employment. Users may continue
to use Earnin after they have been laid off; we exclude payments which
Earnin classifies as unemployment payments, either based on the user's
registration with Earnin as being unemployed, or based on the string
description of the transaction.

Finally, we omit the first and last 32 weeks of data for each user
from our analysis to mitigate non-random entry and exit from the customer
base. We observe that a person's probability of finding or losing
a job is elevated near the time they entered or exited the sample---changes
in employment likely induce people to sign up for the service or reconsider
their need for the service. However, after omitting the first and
last 32 weeks, the probability of a user finding or losing a job is
no longer correlated with the number of weeks that elapse after entry
into the sample or before exiting the sample.

*Mapping paychecks to chained employment.* We construct an
employment series in the Earnin data from our analysis sample as follows.
In the paycheck-level data, we observe the worker's paycycle frequency.
As in the Paychex data, we use paycycle frequency to construct an
employment series by assuming that workers are employed throughout
the full duration of their paycycle. That is, we assume that a worker
paid every two weeks has been fully employed for the two weeks prior
to receiving their paycheck. We exclude workers with pay periods greater
than 3 weeks, omitting approximately 1% of the sample. To account
for the delay in receipt of paychecks, we shift the Earnin series
back by one week. We then take the count of employed individuals across
the Earnin sample as our measure of employment. We use that to calculate
an indexed measure of week-over-week employment percentage change,
for each geographic unit, indexed to their first week in the sample.
Then, the average percentage change in the four weeks from January
4 to 31, 2020 is set as the reference value and
indexed to zero. The employment series is then expressed as a change
relative to January 4 to 31, 2020. We suppress estimates for ZIP codes with
fewer than 10 paychecks observed over this period.

## Masking and Publication

In the Paychex and Intuit combined series, we suppress cells in a few cases where
Intuit data do not provide coverage for a given geographical region or industry. We
also suppress cells in which Paychex records fewer than 150 monthly
employees in January 4 to 31, 2020 at the geography x wage quartile or
geography x industry (2-digit NAICS) x wage quartile level, depending
on the series. When aggregating employment series to the geographical
level without breakdowns by industry or wage quartile, however, we
use data from all cells, without masking.

## Accounting for Low-Wage Employment Changes due to Wage Growth {#subsec:Appx-Employment-WageGrowth}

To calculate the share of low-wage employment changes due to wage
growth, we decompose the level of the bottom-quartile-wage series
from Figure \ref{fig:employment_changes_by_income_quartile} at the
end of December 2021 into the part explained by real wage growth and
the part that is not. We perform this decomposition by estimating ventile-specific
wage growth rates in the CPS, calculated as the month-to-month change
in ventiles, reweighting each monthly CPS sample to hold constant
the distribution of industry, occupation, race, gender, age, education,
region, and citizenship. This follows the procedure in @gould2022.
Under the assumption that all remaining wage changes within these
buckets reflect wage growth rather than changes in the nature of the
jobs themselves, this procedure will estimate wage growth at each
ventile of the wage distribution. There were extreme movements into
and out of the labor force between January and July 2020, for which
the reweighting may not adjust correctly. We thus employ our estimation
procedure to estimate wage growth only between July 2020 and December 2021. 
We impute wage growth rates between January and July
2020 using a 2.9% annualized wage growth rate (which prevailed in 2019). @grisby2021 suggest that
this is likely an upper bound on the wage growth actually experienced
during the early months of the pandemic, so that our overall wage
growth adjustment will also be an upper bound on the changes in bottom-wage-quartile
employment explained by wage growth.

With ventile-specific wage growth rates in hand, we then adjust the
distribution of wages in the January 2020 CPS and recalculate total
employment below the bottom-quartile-wage threshold for December 2021
($13.79). We additionally add uniform noise between \[-$0.50, $0.50\]
to whole number wages to smooth out spikes in the wage distribution
at whole numbers, to mirror our treatment of these integer-wage spikes
in the Paychex data (see Section \ref{subsec:Data-Employment} for
more details on this issue). Bottom-quartile-wage employment falls
in this counterfactual by {{notexpl_pp}}pp, reflecting the decline
not explained by real wage growth; the remaining {{expl_pp}}pp
deficit in December 2021 is the share explained by real wage growth.

## Benchmarking

*Comparisons to QCEW and OES.* Appendix Table \ref{tab:industry_share}
compares industry shares in each of the data sources above to nationally
representative statistics from the Quarterly Census of Employment
and Wages (QCEW). The Paychex-Intuit combined sample and the Earnin
sample are broadly representative of the U.S. industry mix. Appendix
Table \ref{tab:wage_oes} shows that wage rates in our data sources
are similar to nationally representative statistics from the BLS's
Occupational Employment Statistics. Overall, our combined datasets
appear to provide a representative portrait of private non-farm employment
in the United States.

# Unemployment Claims Series Construction {#sec:Data-UI}

## Structure of Initial Data

We collect state-level unemployment initial claims data from the Department
of Labor, Employment and Training Administration public data. We additionally
collect data on state-level PUA, PEUC and continued claims from the
Bureau of Labor Statistics public data. Finally, we gather county-level
unemployment claims data from 22 state agencies that publish their
state's data in various online formats, which we scrape and transform
into a standard format.

## Internal Data Processing

### *Harmonizing weekly and monthly estimates* {-}

In some cases, states only publish county-level data at the monthly
frequency. To align with the weekly series of most UI data we collect, 
we impute weekly values for counties with monthly reports
using the county-level monthly totals and the
state-level distribution of weekly claims in that month,
as published by the Department of Labor. This imputation implicitly
assumes that each county within the state had the same relative distribution
of UI claims within the month.
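
A minimal sketch of this imputation follows, with a hypothetical function name and illustrative inputs. Returning zeros when the state reports no claims in the month is an assumption added to avoid division by zero.

```python
def impute_weekly(county_monthly_total, state_weekly_claims):
    """Split a county's monthly claims across weeks using state weekly shares."""
    state_total = sum(state_weekly_claims)
    if state_total == 0:
        # Assumption: with no state claims there is nothing to allocate.
        return [0.0 for _ in state_weekly_claims]
    return [county_monthly_total * week / state_total
            for week in state_weekly_claims]
```

By construction, the imputed weekly values sum to the county's monthly total.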

## Masking and Publication

We apply no further masking beyond any masking applied by state and
national agencies.

# COVID-19 Infections and Vaccinations Series Construction

## Structure of Initial Data

We collect publicly available data on cases and deaths reported by
the [New York Times](https://github.com/nytimes/covid-19-data) and the [Centers for Disease Control and Prevention](https://covid.cdc.gov/covid-data-tracker/\#datatracker-home),
publicly available data on hospitalizations from COVID-19 from the
[U.S. Department of Health and Human Services](https://beta.healthdata.gov/Hospital/COVID-19-Reported-Patient-Impact-and-Hospital-Capa/g62h-syeh),
publicly available data on the number of vaccine doses administered
from the [Centers for Disease Control and Prevention](https://covid.cdc.gov/covid-data-tracker/\#datatracker-home),
and publicly available data on the number of COVID-19 tests from [Johns Hopkins University](https://github.com/govex/COVID-19). 

## Internal Data Processing

### *Reducing the influence of outliers* {-}

We manually review any spikes in cases, tests, or deaths that are
larger than 25%. If news reports suggest that the spike is a reporting
artifact, we smooth the data by imputing a value for the day of the
spike using the growth rate in the outcome on the prior day.
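
The flag-and-smooth procedure can be sketched as below. The names are hypothetical, and the sketch assumes that a spike is measured as a day-over-day increase of more than 25% and that the imputed value equals the prior day's value times the prior day's growth factor; the manual review against news reports is not modeled.

```python
def flag_spikes(series, threshold=0.25):
    """Indices whose day-over-day increase exceeds the threshold."""
    flagged = []
    for t in range(1, len(series)):
        previous = series[t - 1]
        if previous > 0 and (series[t] - previous) / previous > threshold:
            flagged.append(t)
    return flagged


def impute_spike(series, t):
    """Replace day t using the growth rate observed on day t - 1.

    Requires t >= 2 and a non-zero value on day t - 2.
    """
    prior_growth = series[t - 1] / series[t - 2]
    smoothed = list(series)
    smoothed[t] = series[t - 1] * prior_growth
    return smoothed
```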

## Masking and Publication

We apply no further masking beyond any masking applied by state and
national agencies.

# Time Outside Home Series Construction {#sec:Data-Mobility}

## Structure of Initial Data

We collect publicly available data on changes in time spent at various
locations from Google's COVID-19 Community Mobility Reports to construct
measures of daily time spent at parks, retail and recreation, grocery,
transit locations, and workplaces. Additionally, we collect publicly
available time use data from the American Time Use Survey.

## Internal Data Processing

### *Generating measures of time at home and away from home* {-}

We use the American Time Use Survey to measure the mean time spent
inside the home (excluding time asleep) and outside the home in January
2018 for each day of the week. We combine the data on time spent in
and outside the home with Google's data on changes in time spent at
and away from home. To do so, we scale the time spent inside the home
in January 2018 by Google's percent change in time spent at residential
locations to get an estimate of time spent inside the home for each
date. The remainder of waking hours in the day provides an estimate
for time spent outside the home.
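
A minimal sketch of this calculation is shown below. The function names are hypothetical, and two assumptions are made for illustration: Google's change is expressed in percent (so 20 means +20%), and waking hours are taken to be the sum of the baseline time inside and outside the home for that day of the week.

```python
def time_at_home(baseline_home_hours, google_pct_change):
    """Baseline hours at home scaled by the percent change at residences."""
    return baseline_home_hours * (1 + google_pct_change / 100)


def time_outside_home(baseline_home_hours, baseline_outside_hours,
                      google_pct_change):
    """Remainder of waking hours after subtracting estimated time at home."""
    # Assumption: waking hours are fixed at their baseline total.
    waking_hours = baseline_home_hours + baseline_outside_hours
    return waking_hours - time_at_home(baseline_home_hours, google_pct_change)
```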

## Masking and Publication

Google does not release data for geographies where their 
[internal quality and privacy thresholds](https://www.google.com/covid19/mobility/data_documentation.html?hl=en#about-this-data)
are not met.

# Educational Progress Series Construction {#sec:Data-Zearn}

[Zearn](https://about.zearn.org/) is a non-profit math curriculum
publisher that combines in-person instruction with digital lessons.
Zearn was used by approximately {{n_students}} students in the U.S. in Spring 2020. 
Many schools continued to use Zearn as part of their math curriculum
after COVID-19 induced schools to shift to remote learning. We use
data from Zearn to measure educational progress during the pandemic.

## Structure of Initial Data

We receive data from Zearn on the number of students using Zearn Math
and student progress in Zearn Math as measured by the number of lessons
on the platform completed by students in a given week for a given school.

## Internal Data Processing

### *Reducing the effects of transitory outliers on the series* {-}

To reduce the effects of transitory outliers, we replace the value
for any school-week that is more than 50% lower (higher) than the value
in the week before or after it with the mean value for the
three relevant weeks.

### *Accounting for missing values* {-}

If a school is actively using Zearn at any point in a given semester,
then for any week where we observe no reported values for both the
number of students using Zearn Math and for student progress in Zearn
Math, we impute these missing values as zeros.

## Masking and Publication

The data we obtain are masked such that any county with fewer than
two districts, fewer than three schools, or fewer than 50 students
on average using Zearn Math during the pre-period of January 6 to February 7, 2020 is excluded.
We fill in these masked county statistics with the commuting zone mean whenever possible. 
We winsorize values reflecting an increase of greater than 300% at the school level.
We exclude schools which did not have at least 5 students using Zearn
Math for at least one week during January 6 to February 7, 2020.
After taking these steps, we aggregate to the county, state, and national
level, in each case weighting by the average number of students using
the platform at each school during the base period of January 6 to February
7, 2020, and we normalize relative to this base period to construct
the indices we report.

## Representativeness of Zearn Data

We assess the representativeness of the Zearn data in Appendix Table
\ref{tab:zearn_demographic} by comparing the demographic characteristics
of the schools for which we obtain Zearn data (based on the ZIP codes
in which they are located) to the demographic characteristics of K-12
students in the U.S. as a whole, as measured in the American Community
Survey. The distribution of income, education, and race and ethnicity
of the schools in the Zearn sample is similar to that in the U.S.
as a whole, suggesting that Zearn provides a representative picture
of online learning for students in the U.S.

# Dates and Geographic Definitions {#sec:Dates-and-Geographic}

In this appendix, we provide additional details about how we define
key dates and geographic units used in our analysis.

*Key Dates for COVID-19 Crisis.* The Economic Tracker includes
information about key dates relevant for understanding the impacts
of the COVID-19 crisis. At the national level, we focus on three key
dates:

- First U.S. COVID-19 Case: 2020-01-20
- National Emergency Declared: 2020-03-13
- CARES Act Signed into Law: 2020-03-27

At the state level we collect information on the following events:

- Schools closed statewide: Sourced from COVID-19 Impact: School Status
Updates by MCH Strategic Data, available [here](https://www.mchdata.com/covid19/schoolclosings).
Compiled from public federal, state and local school information and
media updates.

- Nonessential businesses closed: Sourced from the Institute for Health
Metrics and Evaluation state-level data (available [here](https://covid19.healthdata.org/united-states-of-america/)),
who define a non-essential business closure order as: “Only locally
defined ‘essential services’ are in operation. Typically, this results
in closure of public spaces such as stadiums, cinemas, shopping malls,
museums, and playgrounds. It also includes restrictions on bars and
restaurants (they may provide take-away and delivery services only),
closure of general retail stores, and services (like nail salons,
hair salons, and barber shops) where appropriate social distancing
measures are not practical. There is an enforceable consequence for
non-compliance such as fines or prosecution.”

- Stay-at-home order goes into effect: Sourced and verified from the
New York Times reopening data, available [here](https://www.nytimes.com/interactive/2020/us/coronavirus-stay-at-home-order.html),
and hand-collected from local news and government sources where needed.

- Stay-at-home order ends: Sourced and verified from the New York Times
reopening data, available [here](https://web.archive.org/web/20200701015308/https://www.nytimes.com/interactive/2020/us/states-reopen-map-coronavirus.html),
and hand-collected from local news and government sources where needed.
Defined as the date at which the state government lifted or eased
executive action or other policies instructing residents to stay home.
We code “regional” and “statewide” expiry of stay-at-home
orders separately. A “regional” expiration of a stay-at-home order
occurs when a stay-at-home order expires in one region within a state,
but not everywhere within the state. A “statewide” expiration
of a stay-at-home order occurs when a stay-at-home order first expires
throughout a whole state, either due to a statewide change in policy,
or due to the stay-at-home order in each county having expired.

- Partial business reopening: Sourced and verified from the New York
Times reopening data, available [here](https://www.nytimes.com/interactive/2020/us/states-reopen-map-coronavirus.html),
and hand-collected from local news and government sources where needed.
Defined as the date at which the state government allowed the first
set of major industries to reopen (non-essential retail or manufacturing
in nearly every case). Deviations from the New York Times reopening
data are deliberate and usually involve our regional classification
or our inclusion of manufacturing. A “regional” reopening occurs
when businesses are allowed to reopen in one region within a state,
but not everywhere within the state. A “statewide” reopening occurs
when businesses are allowed to reopen throughout a whole state, either
due to a statewide change in policy, or due to restrictions being
eased in each individual county.

*Geographic Definitions.* For many of the series we convert
from counties to metros and ZIP codes to counties. We use the HUD-USPS
ZIP code Crosswalk Files to convert from ZIP code to county. When
a ZIP code corresponds to multiple counties, we assign the entity
to the county with the highest business ratio, as defined by HUD-USPS
ZIP Crosswalk. We generate metro values for a selection of large cities
using a custom metro-county crosswalk, available in Appendix Table
\ref{tab:city_crosswalk}. We assigned metros to counties and ensured
that a significant portion of the county population was in the metro
of interest. Some large metros share a county; in this case, the smaller
metro was subsumed into the larger metro. We use the Uniform Data
Systems (UDS) Mapper to crosswalk from ZIP codes to ZCTAs.
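
The rule for ZIP codes that span several counties can be sketched as follows. The function name and the row layout are hypothetical, and ties are resolved in favor of the first row encountered, which is an assumption.

```python
def zip_to_county(crosswalk_rows):
    """Assign each ZIP code to the county with the highest business ratio.

    crosswalk_rows: iterable of (zip_code, county_fips, business_ratio).
    """
    best = {}
    for zip_code, county, business_ratio in crosswalk_rows:
        # Keep the county with the largest business ratio seen so far.
        if zip_code not in best or business_ratio > best[zip_code][1]:
            best[zip_code] = (county, business_ratio)
    return {zip_code: county for zip_code, (county, _) in best.items()}
```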

# Analysis of Economic Impact Payments {#sec:Appx-Stimulus-Policy}

This appendix presents technical details for our analysis of the effects
of the three rounds of economic impact (i.e., “stimulus”) payments
on consumer spending.

## Construction of the Data {#subsec:Stimulus-Policy-Data}

In order to prepare the spending data for analysis of the stimulus,
we first aggregate the raw data to calculate total card spending on
each date, in each ZIP code income quartile. See Section \ref{subsec:Data-Spending}
and Appendix \ref{sec:Appx-Affinity} for more details on the source
of the consumer spending series. Note that unlike in our baseline
spending series, we use total spending (thereby including both intensive
margin changes in spending per card and extensive margin changes in
the number of cards) to ensure consistency in comparisons of our estimates
across different stimulus rounds.

We then index spending in each ZIP code income quartile relative to
spending over January 4 to 31, 2019. We include data from January 2020
in the “control” series, normed to January 2019, for the purpose
of estimating the effects of the second stimulus in January 2021.

We residualize indexed spending in each ZIP code income quartile with
respect to day-of-week fixed effects calculated using 2019 data. For
the April 2020 and March 2021 stimulus payments, we further residualize
spending with respect to a linear pre-trend estimated in the pre-period
data pooled across income quartiles; this procedure adjusts for economic
trends related to the pandemic that differ from the prior year (and
thus remain in the indexed data). We do not adjust for a linear pre-trend
for the January 2021 stimulus payment due to the omission of the holiday
period immediately before the stimulus payments.
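
A minimal sketch of the day-of-week adjustment is given below, assuming that each day-of-week effect is the reference-year mean for that weekday minus the overall reference-year mean. That normalization, the function names, and the toy series are assumptions for illustration; the linear pre-trend adjustment is not shown.

```python
from collections import defaultdict
from datetime import date, timedelta


def day_of_week_effects(reference):
    """Weekday effects: mean by weekday minus the overall mean.

    reference: dict mapping datetime.date -> indexed spending in the
    reference year (2019 in the text).
    """
    by_dow = defaultdict(list)
    for d, value in reference.items():
        by_dow[d.weekday()].append(value)
    overall = sum(reference.values()) / len(reference)
    return {dow: sum(v) / len(v) - overall for dow, v in by_dow.items()}


def residualize(series, effects):
    """Subtract the estimated weekday effect from each daily observation."""
    return {d: value - effects[d.weekday()] for d, value in series.items()}


# Toy reference series: four full weeks in which weekends run 0.1 higher.
start = date(2019, 1, 7)
reference = {}
for i in range(28):
    d = start + timedelta(days=i)
    reference[d] = 1.0 + (0.1 if d.weekday() >= 5 else 0.0)

effects = day_of_week_effects(reference)
residual = residualize(reference, effects)
```

In this toy series the weekday pattern is the only source of variation, so the residualized series is flat.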

Figure \ref{fig:stimulus_eventstudy} plots these data, differenced
treatment minus control for each day relative to the stimulus events.
Appendix Figure \ref{fig:stimulus_not_detrended} depicts analogous
plots without adjusting for pre-trends in any of the three stimulus
payment periods. As discussed in Section \ref{subsec:Stimulus-Eval},
our analysis yields similar conclusions regardless of the linear adjustment
for pre-trends.

## Calculation of Effects of Stimulus Payments on Consumer Spending

We estimate the effects of each stimulus payment on indexed consumer
spending using a difference-in-differences approach (Appendix Table
\ref{tab:policy_stimulus}). To capture the non-linear dynamics evident
in the non-parametric figure, we estimate separate treatment effects
for the first five days (Column 1) and from the 6th day onwards (Column
2). This window runs through the 25th day for the stimulus payments
in April 2020 and March 2021, and through the 16th day for the stimulus
payment in January 2021 (reflecting the data available at the time
of our “real-time estimate”). Finally, we sum these effects to
estimate the causal effect of the stimulus on spending the month after
checks were sent out, under the identification assumption that daily
trends in spending in 2020 would have matched those in 2019 (up to
a linear difference in pre-trends for the first and third stimulus).
We convert these estimates into a projected “first month” effect,
measured in percentage points of baseline spending levels, by assuming
the daily estimate from the second treatment coefficient within each
stimulus continues through the 31st day (Column 3).

## Calculating Spending Changes per Check Recipient {#subsec:Stimulus-Policy-Rescaling}

We map percentage point increases in spending into dollars per $1,200
check in three steps.

First, we translate the percentage point impact estimated from our
difference-in-differences model ($\beta$) for each ZIP-income quartile into total additional dollars of spending. 
The rescaled total dollar effect $\tilde{\beta}$ is $\beta$ $\times$ 
that ZIP-income quartile's share of total spending in the Affinity data in January 2019 $\times$ 
total card spending in NIPA Table 2.3.5 in January 2019 (following the method described in Appendix Figure \ref{fig:affinity_nipa_marts}).  

Second, we calculate the total amount of stimulus dollars for which
residents of each ZIP-income quartile were eligible ($S$), which
is the number of households in each set of ZIP codes times the eligibility
rate. We calculate the number of households by household-type x income
bin x ZIP code income quartile from the ACS 2014-2018, where household-type
is the combination of single vs. married and number of children. We
use total reported income in the ACS as a proxy for adjusted gross
income (AGI), which determines eligibility for stimulus payments.
We assume that mean household size is constant across income bins
within each ZIP code for each household type (i.e. we assume the mean
household size of married and unmarried households does not vary by
income). This assumption permits us to combine these datasets to calculate
the number of people in each income bin x household-type x ZIP code
income quartile.

We then calculate the eligibility rate at the ZIP code level in three
steps: (A) computing the share of people in single households who
are eligible for stimulus payments; (B) computing the share of people
in joint-filing households who are eligible for stimulus payments;
and (C) combining these estimates using data on marriage rates to
create overall ZIP code-level eligibility rates. We assign fractional
eligibility rates to reflect partial payments; for instance, a single
household with no children eligible for a $600 check during the first
stimulus (which paid $1,200 checks) would be assigned an eligibility
rate of 50%.

*(A) Eligibility among single households.* Single households
with incomes below $75,000 were eligible to receive the full stimulus
amount. The precise structure of payments depends on the number of
children within the household; for simplicity, we use the structure
of payments for families without children for all families. In practice,
since high-income households with children faced more lenient eligibility
requirements than we assume, our estimates represent an upper bound
for consumption per stimulus recipient.

For the April 2020 stimulus, single households earning $75,000-$99,000 were eligible for partial stimulus payments. 
For the January 2021 and March 2021 stimulus payments, the partial eligibility income ranges were $75,000-$87,000 and $75,000-$80,000 respectively.
However, the most granular income bins available in the ACS 2014-2018 data at a ZIP code level do not disaggregate
households earning between $75,000 and $99,999. As such, to assign eligibility rates for single households in this income bin, 
we assume that incomes are uniformly distributed within each ZIP code x income bin. We then assign these households an eligibility rate equal to 
$0.5 \times \frac{99{,}000-75{,}000}{99{,}999-75{,}000}$ for the April 2020 stimulus; $0.5 \times \frac{87{,}000-75{,}000}{99{,}999-75{,}000}$
for the January 2021 stimulus; and $0.5 \times \frac{80{,}000-75{,}000}{99{,}999-75{,}000}$ for the March 2021 stimulus.

We assign an eligibility rate of zero to single households earning above the partial eligibility income range.

*(B) Eligibility among married households.* We treat married
households as joint-filers. Married households with incomes below
$150,000 were eligible to receive the full stimulus amount. As above,
we do not account for the effects of number of children within the
household on phase-out.

For the April 2020 stimulus, joint-filer households earning $150,000-$198,000 were eligible for partial stimulus payments. 
For the January 2021 and March 2021 stimulus payments, the partial eligibility income ranges were $150,000-$174,000 and $150,000-$160,000 respectively.
The most granular income bins available in the ACS 2014-2018 data at a ZIP code level do not disaggregate
households earning between $150,000 and $199,999.
As before, to assign eligibility rates for married households in this income bin, we assume that incomes are uniformly distributed within each ZIP code x income bin. 
We then assign these households an eligibility rate equal to $0.5 \times \frac{198{,}000-150{,}000}{199{,}999-150{,}000}$ for the April 2020 stimulus; $0.5 \times \frac{174{,}000-150{,}000}{199{,}999-150{,}000}$
for the January 2021 stimulus; and $0.5 \times \frac{160{,}000-150{,}000}{199{,}999-150{,}000}$ for the March 2021 stimulus.

We assign an eligibility rate of zero to married households earning above the partial eligibility income range.
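
The partial-eligibility rates in (A) and (B) follow from the same arithmetic, which the short Python sketch below reproduces. The function and dictionary names are hypothetical. We read the 0.5 factor as the average payment share within the phase-out range; that reading is an interpretation rather than something stated explicitly above.

```python
def partial_bin_eligibility(bin_low, bin_high, phaseout_end, mean_partial_rate=0.5):
    """Eligibility rate for an ACS income bin that contains the phase-out range.

    Assumes incomes are uniformly distributed on [bin_low, bin_high] and that
    households between bin_low and phaseout_end receive, on average,
    mean_partial_rate of the full payment (the 0.5 factor in the text).
    """
    share_in_phaseout = (phaseout_end - bin_low) / (bin_high - bin_low)
    return mean_partial_rate * share_in_phaseout


# Upper end of the partial-eligibility range for each stimulus (from the text).
PHASEOUT_END = {
    "April 2020": {"single": 99_000, "married": 198_000},
    "January 2021": {"single": 87_000, "married": 174_000},
    "March 2021": {"single": 80_000, "married": 160_000},
}
# ACS income bins that contain the phase-out range (from the text).
ACS_BIN = {"single": (75_000, 99_999), "married": (150_000, 199_999)}

rates = {
    (stimulus, household): partial_bin_eligibility(*ACS_BIN[household], end)
    for stimulus, ends in PHASEOUT_END.items()
    for household, end in ends.items()
}
```

For example, `rates[("April 2020", "single")]` evaluates the first expression in (A), roughly 0.48.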

*(C) ZIP Code-level eligibility rates.* Steps (A) and (B) allow
us to assign eligibility rates to each income bin x ZIP code x household
type (i.e. single vs. married). We then turn to calculating the composition
of household types within each income bin. To do so, we first regress
marriage rates on median household income at the ZIP code level, weighting
by population; we assume the ZIP code-level relationship between marriage
rates and household income approximates the individual-level relationship
between marriage rates and household income. We then assign a marriage
rate equal to the fitted value of marriage rates at the midpoint of
each household income bin. This allows us to estimate eligibility
rates within each ZIP code x income bin. Finally, we calculate a population-weighted
mean eligibility rate within each ZIP code income quartile. After
that, we multiply this rate by $1,200 and by the population within
each ZIP code income quartile to calculate total expected stimulus
spending in each ZIP code income quartile.
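
Step (C) can be sketched as follows. The ZIP-level inputs are made up, the helper names are hypothetical, and two details are assumptions not stated above: the fitted marriage rate is clipped to the unit interval, and the bin-level eligibility rate is taken to be the marriage-rate-weighted mix of the married and single rates.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least squares of y on x with an intercept.

    Returns (intercept, slope).
    """
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def bin_eligibility(bin_midpoint, elig_single, elig_married, intercept, slope):
    """Mix single and married eligibility rates using the fitted marriage rate."""
    marriage_rate = intercept + slope * bin_midpoint
    marriage_rate = min(max(marriage_rate, 0.0), 1.0)  # assumption: clip to [0, 1]
    return marriage_rate * elig_married + (1 - marriage_rate) * elig_single


# Made-up ZIP-level data: median household income, marriage rate, population.
median_income = [40_000, 60_000, 80_000, 120_000]
marriage_rate = [0.35, 0.45, 0.55, 0.75]
population = [10_000, 20_000, 15_000, 5_000]

intercept, slope = weighted_linear_fit(median_income, marriage_rate, population)

# Illustrative bin: single households partially eligible, married fully eligible.
example = bin_eligibility(87_500, 0.48, 1.0, intercept, slope)
```

The final population-weighted mean across bins within each ZIP code income quartile is a straightforward weighted average and is omitted here.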

Third, we adjust for the actual fraction of stimulus checks paid during
the treatment period ($f$). To do so, we calculate the share of the
stimulus payments distributed by the first week of the treatment period
using Daily Treasury Statements. For instance, the total amount disbursed
for stimulus payments under the CARES Act of April 2020 is $271.4
billion [@irs_databook_2021]. We use Daily Treasury Statements
to calculate that total spending on stimulus payments prior to April
21, 2020 was roughly ${{stim1_apr_pre}} billion. We therefore find
that roughly {{share_distributed_april}}% of payments had been
distributed by April 21. A similar calculation for payments made under
the COVID-related Tax Relief Act of 2020 yields estimates that roughly
{{share_distributed_january}}% of payments were distributed
by January 10, 2021, and that {{share_distributed_march}}% of payments
made under the American Rescue Plan had occurred by March 23, 2021.
We assume that these rates are constant across ZIP code income quartiles;
it is possible that this fraction is higher for higher income households,
due to the larger number of low-income households without bank information
on file at the IRS, in which case we would overestimate the total
stimulus payments in the lowest ZIP code income quartile, leading
to an underestimate of the amount spent per recipient for these households.

Combining these three steps, our estimate of the dollars spent per stimulus payment 
is $\tfrac{\tilde{\beta}}{S \cdot f}$, reported in Appendix Table \ref{tab:policy_stimulus}, Column 4. 
Our final estimate of the dollars spent per $1,200 received is $\tfrac{\$1{,}200}{\$X} \cdot \tfrac{\tilde{\beta}}{S\cdot f}$,
where $\$X$ is the full stimulus payment amount for each stimulus ($1,200 for the April 2020 stimulus; 
$600 for the January 2021 stimulus; and $1,400 for the March 2021 stimulus).
These estimates are reported in Figure \ref{fig:stimulus_effect_sizes}
and Appendix Table \ref{tab:policy_stimulus}, Column 5.
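
The combination of the three steps amounts to a single expression, evaluated in the sketch below exactly as written above. The function name and the inputs are purely illustrative round numbers rather than estimates, and the units of `total_eligible` follow whatever convention is used for $S$ in the second step.

```python
def spending_per_1200(beta_tilde, total_eligible, share_paid, full_payment):
    """Evaluate (1200 / X) * beta_tilde / (S * f).

    beta_tilde: rescaled total dollar effect on spending
    total_eligible: S, total stimulus for which residents were eligible
    share_paid: f, fraction of payments made during the treatment period
    full_payment: X, full payment amount for the stimulus round
    """
    per_payment = beta_tilde / (total_eligible * share_paid)
    return (1200 / full_payment) * per_payment


# Illustrative round numbers only (not taken from the data).
result = spending_per_1200(
    beta_tilde=6.0e9, total_eligible=2.0e7, share_paid=0.5, full_payment=600
)
```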

# Supplemental Policy Analyses {#sec:Supplemental-Analyses}

## State-Ordered Reopenings

Many states enacted stay-at-home orders and shutdowns of businesses
in an effort to limit the spread of COVID infection and later reopened
their economies by removing these restrictions. We examine how these
executive orders affected economic activity by exploiting variation
across states in the timing of shutdowns and reopenings.

Throughout this section, we define the reopening date to be the day
that a state *began* the reopening process (see Appendix \ref{sec:Dates-and-Geographic}
for details). In most states, reopening was a gradual process in which
certain industries and types of businesses opened before others, but
there was substantial heterogeneity across states in the precise form
that the reopening took. Our estimates should therefore be viewed
as an assessment of the average impact of typical reopening efforts
on aggregate economic activity; we defer a more detailed analysis
of how different types of reopenings affected different sectors (which
can be undertaken with the data we have made publicly available) to
future work.

We begin with a case study comparing Colorado and New Mexico that
is representative of our broader findings. These two states both issued
stay-at-home orders during the final week of March 2020 (New Mexico on
March 24, Colorado on March 26). Colorado then partially reopened
its economy, permitting retail and personal service businesses to
open to the public, on May 1, 2020, while New Mexico did not reopen until
two weeks later, on May 16. Appendix Figure \ref{fig:colorado_newmexico}
plots consumer spending (using the Affinity Solutions data) in Colorado
and New Mexico. Spending evolved nearly identically in these two states:
in particular, there is no evidence that the earlier reopening in
Colorado boosted spending during the two intervening weeks before
New Mexico reopened.

Appendix Figure \ref{fig:reopenings_eventstudies} generalizes the
case study in Appendix Figure \ref{fig:colorado_newmexico} by studying
partial reopenings in the five states that issued such orders on or
before April 27, 2020. For each reopening date (April 20, 24 and 27), 
we compare the trajectory of spending in treated states to a group of control states 
that had not reopened as of three weeks after the treated state reopened. We select multiple control
states (listed in Appendix Table \ref{tab:reopenings_states}) for
each of the reopening dates by matching on pre-period indexed spending
(relative to January) during the three weeks prior to reopening. Specifically, for each reopening date $t$, 
we estimate each state's rank (ranging from 0 to 1) in the distribution of mean indexed spending pooling weeks $t-1$, $t-2$, and $t-3$.
We then select control states with rank within $0.2$ of the treated states' mean rank, separately for each reopening date.
We then calculate unweighted means of the outcome variables in the control
and treatment states to construct the two series for each reopening
date. Finally, we pool these three event studies together (redefining
calendar time as time relative to the reopening date) to create Appendix
Figure \ref{fig:reopenings_eventstudies}.
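
The control-state selection rule can be sketched as follows. The state labels and spending values are made up, and the rank definition (position in the sorted order divided by the number of states minus one, with no special handling of ties) is one assumption about how a rank between 0 and 1 might be computed.

```python
def unit_ranks(values):
    """Rank each state from 0 (lowest value) to 1 (highest value)."""
    ordered = sorted(values, key=values.get)
    n = len(ordered)
    return {state: i / (n - 1) for i, state in enumerate(ordered)}


def select_controls(pre_spending, treated, candidates, bandwidth=0.2):
    """Candidate states whose rank is within `bandwidth` of the treated mean rank.

    pre_spending: dict mapping state -> mean indexed spending over the three
    weeks before the reopening date.
    candidates: states that had not reopened three weeks after that date.
    """
    ranks = unit_ranks(pre_spending)
    target = sum(ranks[s] for s in treated) / len(treated)
    return sorted(s for s in candidates if abs(ranks[s] - target) <= bandwidth)


# Made-up pre-period indexed spending for seven hypothetical states.
pre_spending = {
    "A": -0.40, "B": -0.36, "C": -0.32, "D": -0.28,
    "E": -0.24, "F": -0.20, "G": -0.16,
}
controls = select_controls(
    pre_spending, treated=["D"], candidates=["A", "B", "C", "E", "F", "G"]
)
```

In this toy example the treated state has rank 0.5, so only its two nearest neighbors in the spending distribution are selected as controls.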

As in the case study of Colorado vs. New Mexico, the trajectories
of spending in the treated states almost exactly mirror those in the
control states. We formalize this comparison using a
difference-in-differences (DD) design that compares treated and control
states in the two weeks before the reopening and the two weeks after. We
estimate that reopenings led to a {{reop_spend_did}} percentage
point increase in spending. This DD estimate also appears in Appendix
Table \ref{tab:reopening_effects}, Column 1. Column 2 replicates
that specification with a three-week analysis window; the DD estimate is virtually 
unchanged at {{did_beta_spend_all_threeweeks}} percentage points. Appendix Figure \ref{fig:reopenings_eventstudies}
shows that we also find little impact of reopenings on employment
(using the Paychex-Intuit data). Finally, Appendix Figure \ref{fig:reopenings_eventstudies}
also shows (using data from Womply) that there was a {{reop_merch_did}}
percentage point increase in the fraction of small businesses open
after states allowed businesses to reopen -- confirming that state
orders did have some mechanical impact on the fraction of businesses
that were open. However, this mechanical effect does not appear to
translate to noticeable impacts on total employment or spending.

In line with these small treatment effect estimates, reopenings accounted
for a relatively small share of the overall variation in economic
conditions across states. To demonstrate this, we first calculate
the actual variance in spending levels and other outcomes across states.
We then counterfactually add our estimated effect of reopening to
all states that were not yet open as of May 18, 2020, and recalculate the
variance. Appendix Figure \ref{fig:reopenings_variance} then plots
1 minus the ratio of the counterfactual variance to the actual variance,
which is a measure of the importance of early reopenings in explaining
the variation in economic activity observed on May 18. These shares
are very low, showing that early reopenings did not play an important
role in explaining why some states had stronger employment trajectories
than others.[^reopenings_ate] These results are consistent with the findings of other contemporaneous
studies showing that little of the state-level variation in employment,
job vacancies, or time spent outside home is related to state-level
stay-at-home orders or business closures [@BartikRothsteinHomebase;
@forsythe2020labor; @lin2020health; @goolsbee2021fear; @VillaBoasSearHashtag].
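
The variance decomposition described above can be sketched in a few lines. The state outcomes and the treatment effect below are made-up numbers chosen only to show the mechanics, and the function name is hypothetical.

```python
from statistics import pvariance


def share_explained_by_reopening(outcomes, reopened, effect):
    """Compute 1 - Var(counterfactual) / Var(actual) across states.

    outcomes: actual outcome for each state on the evaluation date
    reopened: True for states that had already reopened by that date
    effect: estimated effect of reopening, added to states not yet open
    """
    counterfactual = [
        y if is_open else y + effect for y, is_open in zip(outcomes, reopened)
    ]
    return 1 - pvariance(counterfactual) / pvariance(outcomes)


# Made-up outcomes for four hypothetical states.
outcomes = [-0.10, -0.30, -0.12, -0.32]
reopened = [True, True, False, False]
share = share_explained_by_reopening(outcomes, reopened, effect=0.02)
```

Because the same number of states enters both variances, using the population or the sample variance gives the same ratio.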

[^reopenings_ate]: We emphasize that these results apply to *average* employment
rates and are thus not inconsistent with evidence of modest impacts
in specific subsectors, particularly at higher wage levels, as identified
e.g., by @ADPHurstetal.

Why did these reopenings have so little immediate impact on economic
activity? The evidence in Section \ref{sec:Impacts} suggests that
health concerns among consumers were the primary driver of the sharp
decline in economic activity in March and April 2020. Consistent with that
evidence, spending fell sharply in most states *before* formal
state closures (Appendix Figure \ref{fig:closures}). If individuals'
own health concerns are the core driver of reductions in spending
during pandemics, governments may have limited capacity to mechanically
restore economic activity through reopenings if those reopenings are
not interpreted by consumers as a signal of reduced health risks.[^reopenings_spillovers]

[^reopenings_spillovers]: In this vein, we stress that our research design only identifies the
impacts of individual states opening earlier vs. later; if one state's
actions impact behavior in other states (e.g., by shaping perceptions
about health risks), the total impacts of shutdowns or reopenings
at a national level could be larger. Moreover, these conclusions only
apply to the initial stages of the pandemic that we study here. If
health concerns diminish over time (e.g., due to quarantine fatigue),
government restrictions could have larger effects on economic activity.

## Paycheck Protection Program: Loans to Small Businesses {#subsec:Paycheck-Protection-Program}

The Paycheck Protection Program (PPP) sought to reduce employment
losses by providing financial support to small businesses. Congress
appropriated nearly $350 billion for loans to small businesses in
an initial tranche paid beginning on April 3, 2020, followed by another
$175 billion in a second round beginning on April 27, 2020. The program
offered loan forgiveness for businesses that maintained sufficiently
high employment (relative to pre-crisis levels).

According to the @ppp_hearing, the stated primary purpose
of the PPP was to encourage businesses to maintain employment even
as they lost revenue. The @sba_statement emphasized the
employment impacts of the PPP as a key measure of the program's success,
noting that the PPP “ensure\[d\] that over approximately 50 million
hardworking Americans stay\[ed\] connected to their jobs” based
on self-reports of the number of jobs retained by firms that received
PPP assistance.

Here, we study the marginal impacts of the PPP on employment directly
using payroll data from Paychex and Earnin, exploiting the fact that
eligibility for the PPP depended on business size. Firms with fewer
than 500 employees before the COVID crisis qualified for PPP loans,
while those with more than 500 employees generally did not. One important
exception to this rule was the food services industry, which was treated
differently because of the prevalence of franchises. We therefore
omit the food services sector from the analysis that follows.[^ppp_food]

[^ppp_food]: According to SBA data on PPP receipt throughout the life of the program,
10.5% of total PPP loan volume (7.1% of the total number of loans)
was disbursed to firms in the food services sector (NAICS 72). The
remaining exceptions to this rule affect relatively few workers: omitting
food services, more than 90% of employees work at firms that face
the 500 employee threshold for eligibility.

We estimate the causal effect of the PPP on employment rates at small
businesses using a difference-in-differences research design, comparing
trends in employment for firms below the 500 employee cutoff (the
treated group) vs. those above the 500 employee cutoff (the control
group) before vs. after April 3, 2020, when the PPP program began.[^ppp_erc] 
We do not condition on firm survival and simply count the total number
of employees still working in each week at firms that initially had
more than 500 vs. fewer than 500 employees. Our estimates thus take
both the intensive (reductions in employees for surviving firms) and
extensive (firm closure) margins into account.

[^ppp_erc]: Firms with more than 500 employees were still eligible for the Employee
Retention Credit (ERC), which gave all firms that lost more than 50%
of their revenue a tax credit worth up to $5,000 per employee if
they did not take up the PPP. While data on ERC takeup are unavailable,
fewer than 10% of CFOs of large firms report revenue losses larger
than 25% [@pwcpulsesurvey], suggesting that the vast majority
of firms with more than 500 employees were not eligible for the ERC
and hence serve as a valid counterfactual for employment in the absence
of government assistance.

Appendix Figure \ref{fig:emp_ppp} plots the average change in employment
rates (inferred from payroll deposits) relative to January 2020 for firms
employing 100-499 employees, which were eligible for PPP loans, vs.
firms employing 500-799 employees, which were generally ineligible
for PPP loans, combining data from Paychex and Earnin.[^paychex_earnin] 
To adjust for the fact that industry composition varies across firms
of different sizes, we reweight by two-digit NAICS code so that the
distribution of employees across industries in the below-500 and above-500 employee
groups matches the overall distribution of employees across industries in January 2020. 
We further control for county x wage quartile x week fixed effects to
account for the differential time patterns of employment rates by
county and wage quartile shown in Section \ref{subsec:Impacts-Employment}.

[^paychex_earnin]: We report estimates pooling Paychex and Earnin because our data use
agreements do not permit us to report results based solely on Paychex
data, and Intuit does not have coverage around the 500 employee cutoff.

Before April 3, 2020, trends in employment were similar among eligible vs. ineligible
firms, showing that larger businesses provide a good counterfactual
for employment trends one would have observed in smaller firms absent
the PPP program (conditional on the reweighting and controls described
above). After April 3, employment rates in the treated ($<500$ employees) and control ($\geq500$
employees) groups diverge and follow slightly different trajectories until August 2020, 
after which employment rates in the two groups are essentially identical again. These findings imply that
the PPP program had little marginal impact on employment at small
businesses under the identification assumption that employment trends
in the two groups would have remained similar absent the PPP.

Appendix Figure \ref{fig:emp_firmsize} plots the change in employment
between the January 4-31, 2020 baseline period and June 2020 by firm size bin. The
decline in employment is quite similar across firm sizes, and is not
markedly smaller for firms below the 500 employee eligibility threshold.[^ppp_discontinuity]

[^ppp_discontinuity]: Because of differences in the measurement of firm sizes in our data
and the SBA data used to determine PPP eligibility (see below), there
is no sharp discontinuity in eligibility at the 500 cutoff. Hence,
we do not interpret this plot using an RD design, but rather view
it as showing that our estimates are insensitive to the bandwidth
used to define the treatment and control groups in the DD analysis.

In Appendix Table \ref{tab:ppp_effects}, we quantify the impacts
of the PPP using OLS regressions of the form:
```{=latex}
\begin{equation}
\text{Emp}_{scqit} = \alpha_{cqt} + \delta\text{Eligible}_{s} + \beta_{DD}\text{Eligible}_{s} \cdot \text{Post-PPP}_{t} + \varepsilon_{scqit},\label{eq:PPP_reg_spec}
\end{equation}
```

where $\text{Emp}_{scqit}$ is the change in employment within each
eligibility group $s$ $\times$ county $c$ $\times$ wage quartile
$q$ $\times$ 2-digit NAICS industry $i$ cell in week $t$, relative
to January 4 to 31, 2020; $\text{Eligible}_{s}$ is an indicator variable
for whether the firm had fewer than 500 employees in the pre-COVID period;
$\text{Post-PPP}_{t}$ is an indicator variable for the date being
on or after April 3, 2020; and $\alpha_{cqt}$ represents a county-wage
quartile-week fixed effect. We estimate this regression on the sample of
firms with 100-799 employees using data from March 11 to August 15, 2020.
We focus on employment impacts up to August 15 because Appendix Figure
\ref{fig:emp_ppp} suggests that employment rates in the two groups
converged after early August (extending the estimation window would
only further reduce the estimated impacts of the PPP). We reweight by two-digit NAICS code so that the
distribution of employees across industries in the below-500 and above-500 employee
groups matches the overall distribution of employees across industries in January 2020. 
We cluster standard errors at the county-industry-eligibility group level to permit correlation
in errors across firms and over time within counties. We estimate
the regression using OLS, weighting by the total number of employees
in the cell from January 4 to 31, 2020.
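
To make the mechanics of equation (\ref{eq:PPP_reg_spec}) concrete, the sketch below estimates $\delta$ and $\beta_{DD}$ by weighted least squares after demeaning within county $\times$ wage quartile $\times$ week cells, which absorbs $\alpha_{cqt}$. The data are synthetic and noiseless, the function name is hypothetical, and the industry reweighting and clustered standard errors are omitted; it is an illustration of the estimator, not the code behind Appendix Table \ref{tab:ppp_effects}.

```python
from collections import defaultdict


def dd_with_cell_fixed_effects(rows):
    """Weighted OLS of emp on eligible and eligible*post with cell fixed effects.

    rows: list of dicts with keys "cell" (id of a county x wage quartile x
    week cell), "eligible" (0/1), "post" (0/1), "emp", and "weight".
    Returns (delta, beta_dd).
    """
    # Step 1: weighted sums of the outcome and regressors within each cell.
    totals = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
    for r in rows:
        w = r["weight"]
        t = totals[r["cell"]]
        t[0] += w
        t[1] += w * r["emp"]
        t[2] += w * r["eligible"]
        t[3] += w * r["eligible"] * r["post"]

    # Step 2: demean within cells and accumulate the 2x2 normal equations.
    a11 = a12 = a22 = b1 = b2 = 0.0
    for r in rows:
        w = r["weight"]
        t = totals[r["cell"]]
        y = r["emp"] - t[1] / t[0]
        x1 = r["eligible"] - t[2] / t[0]
        x2 = r["eligible"] * r["post"] - t[3] / t[0]
        a11 += w * x1 * x1
        a12 += w * x1 * x2
        a22 += w * x2 * x2
        b1 += w * x1 * y
        b2 += w * x2 * y

    # Step 3: solve the 2x2 system by Cramer's rule.
    det = a11 * a22 - a12 * a12
    delta = (b1 * a22 - b2 * a12) / det
    beta_dd = (a11 * b2 - a12 * b1) / det
    return delta, beta_dd


# Synthetic data with known parameters: delta = -1.0 and beta_dd = 2.0.
rows = []
for county in ("A", "B"):
    for week in range(4):
        post = 1 if week >= 2 else 0
        alpha = -10.0 - week + (2.0 if county == "B" else 0.0)  # arbitrary cell effect
        for eligible, weight in ((1, 300.0), (0, 700.0)):
            emp = alpha - 1.0 * eligible + 2.0 * eligible * post
            rows.append({"cell": (county, 1, week), "eligible": eligible,
                         "post": post, "emp": emp, "weight": weight})

delta, beta_dd = dd_with_cell_fixed_effects(rows)
```

Because the synthetic outcome contains no noise, the estimator recovers the parameters used to generate the data.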

Column 1 of Appendix Table \ref{tab:ppp_effects} presents the baseline estimate obtained from regression equation
(\ref{eq:PPP_reg_spec}) of $\beta_{DD}=$ {{ppp_eligible_beta}} percentage points (s.e. = {{ppp_eligible_se}}), an
estimate that matches Appendix Figure \ref{fig:emp_ppp} and is similar
to that obtained in confidential ADP data in contemporaneous work
by @autor2020evaluation. The mean decline in employment
among firms in the control group up to August 15, 2020 was {{ppp_control_decline_aug15}} percentage points, implying
that the PPP saved {{ppp_jobs_saved_pct}}% of the jobs that would otherwise have been
lost between April and August 2020. In Column 2, we reduce the bandwidth
to focus more narrowly around the 500-employee size threshold; the
estimate is attenuated but not statistically distinguishable from that in Column 1. 

Our difference-in-differences research design identifies the causal
effect of the PPP on eligible firms under the assumption that the
PPP did not have a causal effect on employment at PPP-ineligible firms.
It is possible that the PPP reduced employment at ineligible firms
(relative to the no-PPP counterfactual) through an employment substitution
channel: ineligible firms might have hired workers laid off from eligible
firms in the absence of the PPP. In the presence of such substitution,
our DD estimate would overstate the causal effect of the PPP on employment
at small businesses, providing an upper bound for its partial equilibrium
impact (ignoring general equilibrium effects that may have influenced
consumer demand and employment at all firms).

*Measurement Error in Firm Sizes.* Our measures of firm size
-- which are based on employment levels in 2019 from Dun & Bradstreet
data for the Paychex sample and ReferenceUSA for the Earnin sample
-- do not correspond perfectly to the measures used by the Small
Business Administration to determine PPP eligibility. Such measurement
error in firm size attenuates the estimates of $\beta_{DD}$ obtained
from (\ref{eq:PPP_reg_spec}) relative to the true causal effect of
PPP eligibility because some of the firms classified as having more
than 500 employees may have actually received PPP (and vice versa).

We estimate the degree of this attenuation bias by matching our data
on firm sizes to data publicly released by the Small Business Administration
(SBA) on a selected set of PPP recipients and assessing the extent
to which firms are misclassified around the threshold. We restrict
attention to firms receiving loans of at least $150,000, as the names
and addresses of these firms are publicly available from the SBA.
We first geocode addresses recorded in SBA and ReferenceUSA-Dun &
Bradstreet data to obtain a latitude and longitude for each firm.
We then compute the trigram similarities between firm names for all
SBA and Dun & Bradstreet firms within twenty-five miles of one another.
We then select one “match” for each PPP recipient from the Dun
& Bradstreet data for the Paychex sample and ReferenceUSA for the Earnin
sample, among the subset of firms within twenty-five miles. For firms
with loans above $150,000, exact loan size is not observed; we
impute loan size as the midpoint of the loan range. The SBA released firm
names and ZIP codes of PPP recipients receiving over $150,000 in
loans, which represent 72.8% of total PPP expenditure. Of the roughly
660,000 PPP recipients of these loans, we merge around 60% of firms
and 62% of total expenditure to firm size data. In this matched subset,
we find that mean PPP expenditure per worker is $2,303 for firms
we classify as having 100-499 employees and $586 per worker for firms
with 500-799 employees (excluding firms in the food services industry).
Given that we match only 62% of the publicly available PPP expenditure
to our data and the publicly available data covers only 73% of total
PPP expenditure, this implies that firms measured as having 100-499
employees in our sample received $\frac{\$2{,}303}{0.62\times0.73}=\$5{,}090$
of PPP assistance per worker, while firms with 500-799 employees received
$\frac{\$586}{0.62\times0.73}=\$1{,}290$ in PPP assistance per worker.[^size_misclassification] 
We calculate that PPP assistance to eligible firms with between 100
and 799 employees (excluding NAICS 72) is $\$5{,}092$ per worker on
average.[^ppp_assistance] Hence, firms with 500-799 workers in the ReferenceUSA-Dun & Bradstreet
data (the control group) were effectively treated at an intensity
of $\frac{\$1{,}290}{\$5{,}092}=25.3\%$, whereas firms with 100-499 workers
in the ReferenceUSA-Dun & Bradstreet data (the treatment group) were
treated at an intensity of $\frac{\$5{,}090}{\$5{,}092}=100\%$. Inflating
our baseline reduced-form estimates by $\frac{1}{(1-0.253)}=1.35$
yields estimates of the treatment effect of PPP eligibility adjusted
for attenuation bias due to mismeasurement of firm size.

[^size_misclassification]: This calculation assumes that the degree of misclassification of eligibility
among identifiable PPP recipients matches the degree of misclassification
of eligibility in the broader ReferenceUSA sample.

[^ppp_assistance]: To compute this statistic, we first calculate the share of total loan
amounts received by non-NAICS 72 firms in the publicly released SBA
data. We begin by imputing the precise loan amount as the midpoint of
the minimum and maximum of the loan range, where the precise loan amount is not
released. We then calculate the share of loans in firms with firm
size between 100 and 499, in NAICS codes other than NAICS 72, under
the assumption that our merge rate is constant by firm size. Using
this approach, we calculate that 13.1% of PPP loan spending was allocated
to non-NAICS 72 firms with 100-499 employees. We then
rescale the total PPP expenditure to the end of June 2020, $521 billion,
by 0.131 to arrive at an estimate of $68.25 billion in PPP loan spending
to non-NAICS 72 firms with 100-499 employees. Finally, we divide $68.25
billion by the number of workers at non-NAICS 72 firms with 100-499
employees to arrive at an estimate of loan spending per worker.

Under standard assumptions required to obtain a local average treatment
effect in the presence of non-compliance -- no direct effect of being
classified as having more than 500 workers independent of the PPP
and a monotonic treatment effect -- we can estimate the LATE of the
PPP on employment rates by multiplying the raw estimates reported
in Appendix Table \ref{tab:ppp_effects}, Column 1 by 1.35 [@angrist1996identification].
This gives us a final preferred point estimate for the effect of PPP
eligibility on employment of {{ppp_eligible_late}} percentage points.

*Costs Per Job Saved.* Using Statistics of U.S. Businesses
(SUSB) data, we calculate that approximately 62.4 million workers
are employed at firms eligible for PPP assistance (53.7 million workers excluding
those in the food services industry, NAICS 72). Apportioning expenditure
by employment, we estimate that 86.1% (53.7/62.4) of total
PPP expenditure was received by non-NAICS 72 firms. We then multiply
this share by total PPP expenditure as of August 8, 2020 to reach an estimate
of $486 billion in PPP spending at non-NAICS 72 firms. Under the assumption that
the PPP's effects on firms with between 100 and 499 employees were
the same in percentage point terms as the PPP's effects on all eligible
firms, our baseline estimates in the combined Paychex-Earnin data
(Appendix Table \ref{tab:ppp_effects}, Column 1), adjusted for attenuation
bias, imply that the PPP saved {{ppp_eligible_late_shares}} $\times$ 53.7 million = {{ppp_jobs_saved}} million jobs
from April through August 15, 2020.[^ppp_food_hypothetical] 
Given a total expenditure on the PPP program of $486 billion through
August 8 (excluding firms in food services), this translates to an
average cost per job saved (over the five months between April and
August 2020) of ${{ppp_cost_per_job_saved}}. Even at the upper bound of the 95% confidence
interval for employment impact, we estimate a cost per job saved of ${{ppp_cost_per_job_saved_upperCI}}.
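
The arithmetic in this paragraph can be traced with a short Python sketch. The worker counts and the $486 billion expenditure figure are taken from the text; the function names are ours, and `illustrative_effect` is a made-up placeholder standing in for the attenuation-adjusted estimate, not a value from our results.

```python
# Figures quoted in the text (SUSB worker counts, PPP spending through August 8, 2020).
WORKERS_ELIGIBLE = 62.4e6          # workers at PPP-eligible firms
WORKERS_ELIGIBLE_EX_FOOD = 53.7e6  # same, excluding food services (NAICS 72)
SPENDING_EX_FOOD = 486e9           # dollars attributed to non-NAICS 72 firms


def employment_share(subset_workers: float, all_workers: float) -> float:
    """Share of eligible employment accounted for by a subset of firms."""
    return subset_workers / all_workers


def jobs_saved(effect_share: float, workers: float) -> float:
    """Jobs saved = employment effect (as a share of workers) times workers."""
    return effect_share * workers


def cost_per_job_saved(spending: float, jobs: float) -> float:
    """Gross program spending divided by the number of jobs saved."""
    return spending / jobs


share_ex_food = employment_share(WORKERS_ELIGIBLE_EX_FOOD, WORKERS_ELIGIBLE)
print(f"{share_ex_food:.1%}")  # 86.1%

# HYPOTHETICAL effect size, for illustration only (not the paper's estimate).
illustrative_effect = 0.02
jobs = jobs_saved(illustrative_effect, WORKERS_ELIGIBLE_EX_FOOD)
gross_cost = cost_per_job_saved(SPENDING_EX_FOOD, jobs)
print(f"{jobs / 1e6:.2f} million jobs, ${gross_cost:,.0f} per job")
```

Because jobs saved enter the denominator, the cost per job is inversely proportional to the estimated employment effect, so bounds on the employment effect translate directly into bounds on the cost.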

[^ppp_food_hypothetical]: If the treatment effect of the PPP program on food services were the
same in percentage terms as in other sectors, we estimate the PPP
saved a total of {{ppp_jobs_saved_all}} million jobs.

To compute net costs to government per job saved, we account
for the fact that a reduction in job losses decreases UI spending.
We evaluate replacement rates at the mean level of earnings for workers
employed at PPP-eligible firms using the statutory rates in Ganong, Noel,
and Vavra [-@ganong2020us, Figure 3a], who estimate that displaced workers received
roughly 120% of weekly earnings for the seventeen weeks between the
beginning of our treatment period (April 3, 2020) and the end of July
2020, and received roughly 40% of weekly earnings for the following
two weeks until the end of our analysis window (August 15, 2020). Computing
expenditure on UI given these replacement rates and mean earnings,
we find that the effect of each job saved by the PPP on UI payments
was $18,350 over our analysis period. Netting these savings out of
the gross cost, we estimate a net cost to the government of ${{ppp_cost_per_job_saved_w_UI}}
per job saved (and ${{ppp_cost_per_job_saved_upperCI_w_UI}} at the upper bound of the 95% confidence
interval for employment impact). For comparison, mean annual earnings for workers at PPP-eligible firms are only $45,000.
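
As a rough check on the UI offset, the sketch below applies the two replacement-rate phases described above to mean earnings. It assumes that the $45,000 annual figure is the earnings level used and that annual earnings are converted to weekly earnings by dividing by 52; both are our simplifying assumptions, as are the function names and the hypothetical gross cost used to illustrate the netting step.

```python
WEEKS_PER_YEAR = 52  # assumption: annual earnings converted to weekly at 52 weeks

# (replacement rate, number of weeks), as described in the text:
# roughly 120% for seventeen weeks, then roughly 40% for two weeks.
UI_PHASES = [(1.20, 17), (0.40, 2)]


def ui_payments_per_job(mean_annual_earnings, phases=UI_PHASES):
    """UI benefits a displaced worker would receive over the analysis window."""
    weekly_earnings = mean_annual_earnings / WEEKS_PER_YEAR
    return sum(rate * weeks * weekly_earnings for rate, weeks in phases)


def net_cost_per_job(gross_cost_per_job, ui_savings_per_job):
    """Gross cost per job saved, net of the UI payments avoided."""
    return gross_cost_per_job - ui_savings_per_job


ui_savings = ui_payments_per_job(45_000)
print(round(ui_savings))  # 18346

# HYPOTHETICAL gross cost, for illustration only (not the paper's estimate).
hypothetical_gross_cost = 400_000.0
net = net_cost_per_job(hypothetical_gross_cost, ui_savings)
print(f"${net:,.0f}")
```

Under these assumptions the UI offset comes to about $18,346 per job, in line with the $18,350 figure reported above.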

\clearpage
